Pb 13 and 14 exam 2020

Hello,

For the 2020 exam, in questions 13 and 14, I don't understand why the optimal delta* HAS to be collinear with w. How can we deduce that?
And for question 13, I don't understand why the minimum value is given by the minimum l2-distance to the hyperplane defined by w.

Thank you for your help!

Top comment

Hi,

Starting from your second question: the problem boils down to finding the minimum \(\ell_2\)-distance from the origin (since we minimize \(||\delta||_2\) in the objective) to the hyperplane defined via \(w^T \delta = c\) where \(c = - w^T x\). It's just a translation of the given optimization problem to a known geometric problem from which we can deduce how to solve it.

Coming back to your first question: this geometric view of the problem (e.g., you can try to draw it to get more intuition) suggests that the optimal \(\delta^*\) should be perpendicular to the hyperplane or, equivalently, collinear with the vector \(w\) that defines the hyperplane. This can be verified via the Pythagorean theorem or the Cauchy-Schwarz inequality (see more details here, "Why this is the closest point").
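
As a sketch of the Cauchy-Schwarz argument, assuming \(w \neq 0\) and writing \(\alpha\) for a scalar introduced here only for illustration (it is not notation from the exam):

\[
\begin{aligned}
|w^T x| = |w^T \delta| &\le ||w||_2 \, ||\delta||_2 \quad \text{for every feasible } \delta, \\
\Rightarrow \quad ||\delta||_2 &\ge \frac{|w^T x|}{||w||_2}, \quad \text{with equality iff } \delta = \alpha w, \\
w^T (\alpha w) = - w^T x \quad &\Rightarrow \quad \alpha = - \frac{w^T x}{||w||_2^2}, \quad \delta^* = - \frac{w^T x}{||w||_2^2} \, w .
\end{aligned}
\]

So the lower bound \(|w^T x| / ||w||_2\) can only be attained by a \(\delta\) that is collinear with \(w\), and the constraint then fixes the scalar.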

I hope that helps.

Best,
Maksym

I am so sorry, but I really do not understand why the problem boils down to finding the minimum l2-distance from the origin to the hyperplane, and why this minimum distance is given by \(|w^T x|/||w||_2\). Could you please help me?

Hi,

So it boils down to finding the minimum L2-distance from the origin to the hyperplane just by the definition of the problem that we aim to solve:
adv_question.jpg

I.e., we minimize the L2-distance \(||\delta||_2\) under the constraint that \(\delta\) should be a point on the hyperplane, i.e. that \(w^T \delta = - w^T x\) holds.

As for why this minimum distance is given by \(|w^T x|/||w||_2\): this is a classical geometric problem, and there are multiple ways to derive the result. In addition to the wikipedia link from above, you can also consult, e.g., this stackexchange question, which shows multiple ways to derive it.
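
If it helps, the formula can also be checked numerically. The snippet below is only a small sketch: the values of `w` and `x` and the helper names are made up for illustration and do not come from the exam. It builds the candidate \(\delta^*\) that is collinear with \(w\), checks that it satisfies \(w^T \delta = - w^T x\), and compares its norm with other points on the same hyperplane.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors given as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def norm2(a):
    """Euclidean (l2) norm."""
    return math.sqrt(dot(a, a))


def closest_perturbation(w, x):
    """Candidate minimizer delta* = -(w^T x / ||w||_2^2) * w, assuming w != 0."""
    scale = -dot(w, x) / dot(w, w)
    return [scale * wi for wi in w]


# Hypothetical 2-D example values, chosen only for illustration.
w = [3.0, 4.0]
x = [1.0, 2.0]

delta_star = closest_perturbation(w, x)
min_dist = norm2(delta_star)
closed_form = abs(dot(w, x)) / norm2(w)  # |w^T x| / ||w||_2

# Any other feasible delta differs from delta* by a vector orthogonal to w.
v = [-w[1], w[0]]  # orthogonal to w in 2-D
others = [[d + t * vi for d, vi in zip(delta_star, v)]
          for t in (-2.0, -0.5, 0.5, 2.0)]

print("constraint residual:", dot(w, delta_star) + dot(w, x))
print("||delta*||_2:", min_dist, " closed form:", closed_form)
for o in others:
    print("feasible delta", o, "has norm", norm2(o))
```

Every other feasible point has a strictly larger norm than `delta_star`, which is the Pythagorean argument in numbers: moving along the hyperplane adds an orthogonal component and can only increase the distance from the origin.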

I hope this further clarifies the problem!
