Q29 exam 2020

Hello,

I was wondering: first, why is training only the first layer a non-convex problem?
And second, if we train only the last layer instead, shouldn't that be equivalent?

Thank you for your answer :)

Top comment

If you train only the last layer, you can treat the earlier layers as a fixed 'feature extractor'. You are then learning a linear model on top of those features, which is a convex problem.
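To make this concrete, here is a tiny numerical sketch (my own toy example, not from the exam): with the extracted feature `h` frozen, a squared loss is a quadratic function of the last-layer weight `w2`, and a quadratic is convex. The check below samples pairs of points and verifies the midpoint convexity inequality on each pair.

```python
import random

def last_layer_loss(w2, h=0.7, y=1.0):
    # h plays the role of the frozen feature extractor's output,
    # so the loss is quadratic (hence convex) in the last-layer weight w2.
    return (w2 * h - y) ** 2

random.seed(0)
for _ in range(1000):
    a, b = random.uniform(-10, 10), random.uniform(-10, 10)
    # Midpoint convexity: f((a+b)/2) <= (f(a) + f(b)) / 2
    assert last_layer_loss((a + b) / 2) <= (last_layer_loss(a) + last_layer_loss(b)) / 2
print("midpoint convexity holds on all sampled pairs")
```

Sampling of course does not prove convexity; here it only illustrates what you can also see directly, namely that the loss is a parabola in `w2`.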

If you train only the first layer, the story is different. The effective loss, seen as a function of the first layer's weights, is the composition of all the later layers with the usual loss, and that composition can be a very complex non-convex function.
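A minimal counterexample (again a toy of mine, not from the exam): take a one-unit network `y_hat = w2 * tanh(w1 * x)`, freeze `w2`, and look at the squared loss as a function of `w1`. Because `tanh` saturates, the loss flattens out for large negative `w1`, and the midpoint convexity inequality fails:

```python
import math

def first_layer_loss(w1, x=1.0, y=1.0, w2=1.0):
    # Network: y_hat = w2 * tanh(w1 * x); squared loss; w2 is frozen.
    # As a function of w1 this composes the loss with a saturating nonlinearity.
    return (w2 * math.tanh(w1 * x) - y) ** 2

a, b = -10.0, 0.0
mid = first_layer_loss((a + b) / 2)                       # ~ 4.0 (saturated)
chord = (first_layer_loss(a) + first_layer_loss(b)) / 2   # = 2.5
print(mid > chord)  # True: convexity is violated, so the loss is non-convex in w1
```

A single violated midpoint inequality is enough to rule out convexity, which is why this check settles the question for the first layer even though sampling could never settle it for the last one.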

Hope this helps!

Okay, thank you, that is clear! So you confirm that the answer would be true if the statement said "training only the last layer" instead of "training only the first layer"?
