Vanishing Gradient
Hello :)
I'm struggling to understand the role of the weights in the vanishing gradient problem.
1) If the weights were not 1, what would be the solution?
2) If the weights were all 1 and we had 4 nodes instead of 3, would there still be a problem?
3) What if we had 40 nodes instead of 3? Would we then have an exploding gradient problem?
Thanks in advance :))
You cannot write a direct equality there; those are bounds.
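To make the "bounds, not equality" point concrete, here is a minimal sketch. It assumes a setup that is not stated in the thread: a chain of single sigmoid units, one per layer, with scalar weights and no biases. The function name `chain_gradient` and the input value `x=0.5` are made up for illustration. It compares the exact gradient of the output with respect to the input against the usual upper bound, which uses the fact that the sigmoid derivative never exceeds 1/4.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def chain_gradient(weights, x=0.5):
    """Exact gradient and upper bound for a chain a_k = sigmoid(w_k * a_{k-1}).

    Assumed toy model: one sigmoid unit per layer, scalar weights, no biases.
    Returns (d a_n / d x, product of |w_k| / 4).
    """
    a = x
    grad = 1.0
    bound = 1.0
    for w in weights:
        a = sigmoid(w * a)
        grad *= w * a * (1.0 - a)  # chain rule: w_k * sigmoid'(z_k)
        bound *= abs(w) * 0.25     # sigmoid'(z) is at most 1/4
    return grad, bound


# All weights equal to 1, with 3, 4 and 40 units in the chain.
for n in (3, 4, 40):
    grad, bound = chain_gradient([1.0] * n)
    print(f"n={n:2d}  exact={grad:.3e}  bound={bound:.3e}")

# Large weights: the bound grows, but saturation can keep the exact gradient small.
grad, bound = chain_gradient([8.0] * 40)
print(f"w=8, n=40  exact={grad:.3e}  bound={bound:.3e}")
```

Under these assumptions, with all weights equal to 1 the bound is (1/4)^n, so adding units makes the gradient shrink further rather than explode. With larger weights the bound can exceed 1, but the exact gradient may still be tiny because the units saturate, which is why the bound cannot be read as an equality.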