Vanishing Gradient

Hello :)

I'm struggling to understand the role of the weights in the vanishing gradient problem.

1) If the weights were not 1, what would be the solution?

2) If the weights were all 1 and we had 4 nodes instead of 3, would there still be a problem?

3) What if we had 40 nodes instead of 3? Would we then have an exploding gradient problem?

Thanks in advance :))

You cannot write a direct equality; those are bounds.
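
To make the "bounds" point concrete, here is a rough sketch. It assumes things the thread does not state: a chain of single-unit layers, a sigmoid activation, and "nodes" meaning the number of layers in the chain. The helper names `chain_gradient` and `gradient_bound` are made up for illustration. Under those assumptions each layer contributes a factor w * sigmoid'(z), and since sigmoid'(z) is at most 1/4, the product is bounded by the product of |w| / 4, not equal to it.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def chain_gradient(x, weights):
    """d(output)/d(input) for a chain of one-unit sigmoid layers (assumed setup)."""
    grad = 1.0
    a = x
    for w in weights:
        a = sigmoid(w * a)
        # Chain rule: each layer multiplies the gradient by w * sigmoid'(z),
        # where sigmoid'(z) = a * (1 - a).
        grad *= w * a * (1.0 - a)
    return grad


def gradient_bound(weights):
    """Upper bound on |gradient|, using sigmoid'(z) <= 1/4 at every layer."""
    bound = 1.0
    for w in weights:
        bound *= abs(w) * 0.25
    return bound


# Compare the actual gradient with its bound for a few chain lengths,
# keeping every weight equal to 1 as in the question.
for depth in (3, 4, 40):
    weights = [1.0] * depth
    print(depth, abs(chain_gradient(0.5, weights)), gradient_bound(weights))
```

In this sketch, with all weights equal to 1 the bound shrinks by a factor of 4 per extra layer, so adding layers makes the gradient smaller, not larger. The bound itself can only exceed 1 when |w| > 4, and even then the actual gradient may stay small because the sigmoid saturates.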
