Hi, TA. I would like to ask why the partial derivative of w1 does not change. I did not understand the answer.
Up again! Would like an answer to this one too. :)
The partial derivative is defined as dL/dw1 at the given operating point. Setting w2=w3 in weight sharing just means that the derivative with respect to the shared weight is really dL/dw2 + dL/dw3 = 1+1 = 2, since you can't change one of the parameters without also changing the other. But w1 isn't sharing weights with any other parameter, so it keeps its original partial derivative of 1.
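Written out with the chain rule, this might look as follows. It is only a sketch: the symbol w for the shared value is introduced here for illustration, and the value 1 for each partial derivative is taken from the answer above.

```latex
% Assumption: w denotes the shared value, so w_2 = w_3 = w.
\frac{dL}{dw}
  = \frac{\partial L}{\partial w_2}\,\frac{dw_2}{dw}
  + \frac{\partial L}{\partial w_3}\,\frac{dw_3}{dw}
  = 1 \cdot 1 + 1 \cdot 1 = 2,
\qquad
\frac{\partial L}{\partial w_1} = 1 \quad \text{(unchanged, since } w_1 \text{ is not tied to anything)}.
```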
Hey! Thanks for the answer.
So if I understood correctly, weight sharing causes the partial derivative with respect to the shared weight to be the sum of the partial derivatives of the weights that are tied together?
I think so. The way I see it: suppose you have a function f(x)=x+x, but instead you wrote it as f(x1,x2)=x1+x2. Then df/dx1=1 and df/dx2=1, but df/dx=1+1=2 :) TAs, correct me if I'm wrong.
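If it helps, the f(x1,x2)=x1+x2 example can be checked numerically. This is just a minimal sketch using a central finite difference; the helper `central_diff` and the operating point `x0 = 3.0` are made up for illustration and are not from the course material.

```python
def central_diff(fn, x, eps=1e-6):
    """Approximate d fn / dx at x with a central finite difference."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)


def f(x1, x2):
    # The untied form from the post above: two separate arguments.
    return x1 + x2


x0 = 3.0  # arbitrary operating point (assumption, any value works here)

# Partial derivatives of the untied function: vary one argument, hold the other fixed.
df_dx1 = central_diff(lambda x1: f(x1, x0), x0)
df_dx2 = central_diff(lambda x2: f(x0, x2), x0)

# Tied form: both arguments are the same variable, so they move together.
df_dx = central_diff(lambda x: f(x, x), x0)

print(round(df_dx1, 6), round(df_dx2, 6), round(df_dx, 6))
```

The first two numbers should come out close to 1 and the last close to 2, i.e. the derivative of the tied version is the sum of the two untied partials.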