Vanishing Gradient

Hello :)

I'm struggling to understand the role of the weights in the vanishing gradient problem.

1) If the weights were not 1, what would be the solution?
IMG_20210108_190756.jpg

2) If the weights were all 1 and we had 4 nodes instead of 3, would there still be a problem?
IMG_20210108_190830.jpg

3) What if we had 40 nodes instead of 3? Would we then have an exploding gradient problem?

Thanks in advance :))

You cannot write a direct equality; those are bounds.
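
To make the point about bounds concrete, here is a minimal sketch. The attached images are not visible here, so the setup is an assumption: a chain with one sigmoid unit per layer and no biases, where "nodes" is read as the depth of the chain. The function names `chain_gradient` and `gradient_bound` are hypothetical. Since the sigmoid derivative is at most 1/4, each layer contributes a factor of at most |w|/4, so the gradient is bounded by the product of those factors rather than equal to it.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def chain_gradient(weights, x):
    """d(output)/d(input) for a chain of single sigmoid units (assumed setup)."""
    grad = 1.0
    a = x
    for w in weights:
        a = sigmoid(w * a)
        # Chain rule: d a / d a_prev = w * sigma'(z), with sigma'(z) = a * (1 - a)
        grad *= w * a * (1.0 - a)
    return grad


def gradient_bound(weights):
    """Upper bound on |gradient|, using sigma'(z) <= 1/4 at every layer."""
    bound = 1.0
    for w in weights:
        bound *= abs(w) * 0.25
    return bound


if __name__ == "__main__":
    # Depths taken from the question: 3, 4, and 40 units, all weights equal to 1.
    for depth in (3, 4, 40):
        ws = [1.0] * depth
        print(depth, abs(chain_gradient(ws, 0.5)), gradient_bound(ws))
```

Under these assumptions, with all weights equal to 1 the bound is (1/4)^n, so adding more units in the chain makes the gradient smaller, not larger. The bound can only grow with depth when |w| > 4, and even then the actual gradient may stay small because large pre-activations push the sigmoid into saturation.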
