I would like to check my explanation for why "no answer is correct" in Q2 and Q3. In Q2, we don't know whether the function is smooth, hence no theorem tells us how fast we converge. In Q3, we have SGD but with a constant stepsize, so again no theorem tells us what happens.
Is my reasoning correct?
Thank you for your help.
For Q2, you have no assumptions about bounded gradients or smoothness, so you indeed cannot use any result seen in class.
For Q3, the same remark applies. Furthermore, you did not see in class any result on the convergence of the last iterate of SGD, only one for a weighted average.
Also regarding Q3, the result in the notes is for a convex combination of the previous iterates, not for the last iterate.
Good point, thank you for adding a comment on that!
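To make the last-iterate versus averaged-iterate distinction concrete, here is a minimal numerical sketch. It is not the setting of Q3: the objective f(x) = x^2 / 2, the Gaussian gradient noise, the stepsize 0.1, and the uniform (rather than weighted) average are all assumptions chosen for illustration, and the function name `sgd_constant_step` is hypothetical. It only illustrates that, with a constant stepsize, the last iterate can keep fluctuating around the minimizer while an average of the iterates ends up much closer to it.

```python
import random


def sgd_constant_step(step=0.1, n_iters=20000, x0=5.0, noise_std=1.0, seed=0):
    """Run SGD with a constant stepsize on f(x) = x^2 / 2 (assumed toy objective).

    Returns the last iterate, the uniform average of all iterates, and the
    root-mean-square size of the iterates over the second half of the run.
    """
    rng = random.Random(seed)
    x = x0
    running_sum = 0.0
    tail_sq_sum = 0.0
    tail_start = n_iters // 2
    for t in range(n_iters):
        # Stochastic gradient: true gradient x plus zero-mean Gaussian noise.
        grad = x + rng.gauss(0.0, noise_std)
        x -= step * grad
        running_sum += x
        if t >= tail_start:
            tail_sq_sum += x * x
    avg = running_sum / n_iters
    tail_rms = (tail_sq_sum / (n_iters - tail_start)) ** 0.5
    return x, avg, tail_rms


last, avg, tail_rms = sgd_constant_step()
print("last iterate:", last)
print("averaged iterate:", avg)
print("typical size of late iterates (RMS):", tail_rms)
```

In this toy run the minimizer is 0, so the printed values can be read directly as errors. A single run proves nothing in general; it is only meant to show why a guarantee stated for an average of the iterates does not automatically transfer to the last iterate.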