Q29 exam 2020

Hello,

I was wondering: first, why is training only the first layer a non-convex problem?
And second, if we train only the last layer instead, shouldn't that be equivalent?

Thank you for your answer :)

Top comment

If you train only the last layer, you can treat the earlier layers as a fixed 'feature extractor'. You are then learning a linear model on top of those features, which is a convex problem.
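To make this concrete, here is a tiny numerical sketch (my own toy example, not from the exam): with the extracted feature `h` frozen, a squared loss is a quadratic function of the last-layer weight `w2`, and a quadratic is convex. The check below samples pairs of points and verifies the midpoint convexity inequality on each pair.

```python
import random

def last_layer_loss(w2, h=0.7, y=1.0):
    # h plays the role of the frozen feature extractor's output,
    # so the loss is quadratic (hence convex) in the last-layer weight w2.
    return (w2 * h - y) ** 2

random.seed(0)
for _ in range(1000):
    a, b = random.uniform(-10, 10), random.uniform(-10, 10)
    # Midpoint convexity: f((a+b)/2) <= (f(a) + f(b)) / 2
    assert last_layer_loss((a + b) / 2) <= (last_layer_loss(a) + last_layer_loss(b)) / 2
print("midpoint convexity holds on all sampled pairs")
```

Sampling of course does not prove convexity; here it only illustrates what you can also see directly, namely that the loss is a parabola in `w2`.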

If you train only the first layer, the story is different. The effective loss, seen as a function of the first layer's weights, is the composition of all the later layers with the usual loss, and that composition can be a very complex non-convex function.
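A minimal counterexample (again a toy of mine, not from the exam): take a one-unit network `y_hat = w2 * tanh(w1 * x)`, freeze `w2`, and look at the squared loss as a function of `w1`. Because `tanh` saturates, the loss flattens out for large negative `w1`, and the midpoint convexity inequality fails:

```python
import math

def first_layer_loss(w1, x=1.0, y=1.0, w2=1.0):
    # Network: y_hat = w2 * tanh(w1 * x); squared loss; w2 is frozen.
    # As a function of w1 this composes the loss with a saturating nonlinearity.
    return (w2 * math.tanh(w1 * x) - y) ** 2

a, b = -10.0, 0.0
mid = first_layer_loss((a + b) / 2)                       # ~ 4.0 (saturated)
chord = (first_layer_loss(a) + first_layer_loss(b)) / 2   # = 2.5
print(mid > chord)  # True: convexity is violated, so the loss is non-convex in w1
```

A single violated midpoint inequality is enough to rule out convexity, which is why this check settles the question for the first layer even though sampling could never settle it for the last one.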

Hope this helps!

Okay, thank you, that is clear! So you confirm that the answer would be true if the statement said "training only the last layer" instead of "training only the first layer"?
