### Optimizing over the space of rank 1 matrices is equivalent to the space of matrices with nuclear norm <= 1

Good afternoon,

I have a few questions regarding homework 8 part 2, where we want to find the projection on $$X$$. I am not sure to understand clearly why it is equivalent to minimize on matrices with nuclear norm $$\leq$$ 1. I must say, I never met the nuclear norm before, so maybe I missed a key property here.
The part where we take the SVD of the matrix C confuses me a bit. When we rewrite $$C = \sum_i \sigma_i u_i v_i^\top$$, I first thought we did so because $$C$$ is a convex combination of the elements of $$X$$, but we also have that $$\sum \sigma_i \leq 1$$. In case we have a convex combination, I believe the sum of "convex coefficients" should be exactly 1, so the reason is probably different. Then I wondered if the fact that $$\sum \sigma_i \leq 1$$ (and not exactly 1) is related to the fact that the rank of matrices in $$X$$ is 1. I remember some property relating the trace and the eigenvalues for symmetric matrices exists, so maybe is it similar for singular values ?

Top comment

First for any rank 1 matrix M which belongs to A, we know that -M also belongs to A, then $$0=\frac{1}{2} M + \frac{1}{2} (-M)$$ is inside space X=conv(A).

For exercise 2, "optimization domain X is the unit ball of trace norm (or nuclear norm)" is given so you can directly convert $$C\in X$$ to $$||C||_*\le1$$. This is because the nuclear norm is defined as the sum of singular values. Consider the form $$C=\sum_i\sigma_iu_iv_i^\top$$. If $$||C||_* =\sum_i\sigma_i= 1$$, then $$C\in \mathcal{A} \subset X$$; if $$||C||_*=\sum_i\sigma_i < 1$$, then $$C= \sum_i\sigma_iu_iv_i^\top + \frac{1-\sum_i\sigma_i}{2} M + \frac{1-\sum_i\sigma_i}{2} (-M)$$ shows $$C\in X$$. We can also prove that for all $$C\in X$$ indicates $$||C||_*\le1$$ in a similar way.

For the second question, we considered the case when the input matrix S is already in the set X. So $$\sum\sigma_i \le 1$$ is used instead of $$\sum\sigma_i = 1$$. Note that in the section 3.5 of the lecture notes excluded this trivial case from Fact 3.6.

Ok, I see now, I forgot about the line stating the fact in the exercice... Thank you explaining me the equivalence, now I understand better

Page 1 of 1