Some questions on 2020 exam

-Q4: How come x^4 is not strongly convex?

-Q11: I agree that it satisfies the PL inequality, but I do not see how we can compute the error for such an exercise.

-Q13: Why is the last option not true? Since x is a global min I would expect that delta_f (x) =0

Top comment

I think I can help you, please someone tell it if I'm wrong

Q4 : The intuition behind this is the following : try to draw a curve that touches the \( x^4 \) curve at a certain point \( x \) but stays below it, the more you move \( x \) to the right (or to the left), the more your curve will have to be flat, so you can't decide of a curvature to fix so that this curve is below the \( x^4 \) curve for any \( x \) (the curvature will tend to 0, without ever be 0, otherwise \( x^4 \) would not be strictly convex)

Q11 : It is the relative error, i.e. \( \frac{f(x_t) -f^{*} }{f(x_0) - f^{*}} = (1-\frac{\mu}{dL})^t \), so you just have to know \( \mu = 1/32 \), \( d = 1 \), \( L = 8 \) and \( t = 1000 \) and this gives the result

Q13 : Beware that the problem might be constrained and thus the global minimum is not necessarily a stationary point

Thanks for your help!
I know about the "visual" way to check if a function is strongly convex, but it does not seem like a very robust way to answer to them ( I keep messing them up). I would like to know if there is a better way from some TA hopefully.

And regarding the PL question, I thought that the error formula you mentioned was only applicable to coord. descent, but I suppose I was wrong.

You can look at the second derivative since the function is twice differentiable, there is no \( \mu > 0 \) such that \( (x^4)'' = 12x^2 \geq \mu \) for all \( x \in \mathbb{R} \).

For the PL question, for \( d=1 \), you can see gradient descent as a coordinate descent that will always sample the same coordinate.

Q11 : It is the relative error, i.e. \( \frac{f(x_t) -f^{*} }{f(x_0) - f^{*}} = (1-\frac{\mu}{dL})^t \), so you just have to know \( \mu = 1/32 \), \( d = 1 \), \( L = 8 \) and \( t = 1000 \) and this gives the result

How do you pass from the bound in terms of \( \| x_0 - x^*\|^2 \) of gradient decent to a bound in terms of \( f(x_0) - f(x^*) \) for the ratio you mention in Q11?

Hi, for strongly convex, can you explain why we need to check the second derivative to see whether it is greater or equal to \mu? Thank you.

Page 1 of 1

Add comment

Post as Anonymous Dont send out notification