Serie 10 Problem 2.1 Retrieving delta*


I have a question about problem 2.1 of the last exercise series. Like the solutions, I was able to obtain the following expression for the adversarial problem,
\( \min_w \frac{1}{n} \sum_{i=1}^n \ell(y_i \cdot w^T x_i - \epsilon ||w||_2) \)
however, I am not sure how to get \(\delta^*\) and \(\hat{x}^*\) from this. Sorry if it is straightforward, but I just do not see it, and I also do not understand which equality the solutions refer to in order to obtain it!

Many thanks


Thanks a lot for the question. I think this reference can be helpful:
In particular, Theorem 16 therein discusses when equality holds in the Cauchy-Schwarz inequality: when the vector \(v\) is a scalar multiple of \(w\) or vice versa. In our notation from lab 10, this means we need to choose the adversarial perturbation \(\delta\) to be collinear with the weight vector, i.e. \(\delta = \alpha w\), where \(\alpha\) is a scalar chosen to achieve equality. In our case, solving for \(\alpha\) gives \(\alpha = - \frac{y_i \varepsilon}{||w||_2}\), which recovers the desired \(\delta^\star\).
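
To spell out the chain of inequalities behind this, here is a short sketch in the notation above. It assumes the perturbation is constrained by \(||\delta||_2 \le \epsilon\) and that \(w \neq 0\); please check this against the exact statement in the exercise.

$$y_i \langle \delta, w \rangle \;\ge\; -|\langle \delta, w \rangle| \;\ge\; -||\delta||_2 \, ||w||_2 \;\ge\; -\epsilon ||w||_2.$$

The first step uses \(|y_i| = 1\) and is tight when \(y_i \langle \delta, w \rangle \le 0\). The middle step is Cauchy-Schwarz and is tight exactly when \(\delta = \alpha w\). The last step is tight when \(||\delta||_2 = \epsilon\), i.e. \(|\alpha| = \epsilon / ||w||_2\). All three hold with equality only if \(\alpha\) has the opposite sign of \(y_i\), which is where the factor \(-y_i\) comes from.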

I hope that helps.

Thank you very much! Have a great weekend

I still don't completely understand how to solve with respect to α. I'm guessing we could use the property in equation 1 (from the Berkeley notes) but I don't understand what <u,v> is. For this example to work, <u,v> would have to be ε. Is that correct? Could you explain in detail how to solve with respect to α? Thank you so much in advance :))

The notation \(\langle u, v\rangle\) denotes the inner product between vectors \(u\), \(v\). For usual Euclidean spaces, it's just \(\langle u, v\rangle = u^T v = \sum_{i=1}^d u_i v_i\).

For this example, it should hold that \(y_i \langle \delta, w \rangle = -\epsilon ||w||_2\). But then we also know that \(\delta^* = \alpha w\), thus we get that \(y_i \langle \alpha w, w \rangle = -\epsilon ||w||_2\) or equivalently

$$\alpha = -\frac{\epsilon ||w||_2}{y_i \langle w, w \rangle} = -\frac{\epsilon ||w||_2}{y_i ||w||_2^2} = -\frac{\epsilon}{y_i ||w||_2} = -\frac{y_i \epsilon}{||w||_2},$$

where we used that \(\langle w, w \rangle = ||w||_2^2\) and \(1/y_i = y_i\) since \(y_i \in \{-1, 1\}\).
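
If it helps to see this numerically, below is a minimal NumPy sketch that checks the closed form. The dimension, the value of \(\epsilon\), the label and the random weight vector are all made-up illustration values (not from the exercise), and the constraint is assumed to be \(||\delta||_2 \le \epsilon\).

```python
import numpy as np

# Illustrative values only (assumptions, not taken from the exercise).
rng = np.random.default_rng(0)
d, eps = 5, 0.3
w = rng.normal(size=d)   # hypothetical weight vector
y = -1                   # label in {-1, +1}

# Closed form from the derivation: delta* = alpha * w, alpha = -y * eps / ||w||_2
alpha = -y * eps / np.linalg.norm(w)
delta_star = alpha * w

# Value of y * <delta, w> attained by delta*, and the Cauchy-Schwarz lower bound.
attained = y * (delta_star @ w)
bound = -eps * np.linalg.norm(w)

# Random perturbations on the eps-sphere should never go below the bound.
candidates = rng.normal(size=(10_000, d))
candidates = eps * candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
random_best = (y * candidates @ w).min()

print(attained, bound, random_best)
```

The first two printed numbers should coincide, and the third should not be smaller than them, which is the equality case of Cauchy-Schwarz in action.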

I hope that helps.
