Non robust features
Hello,
In the derivation above, I didn't understand why it is possible to drop the \(\hat{y}^{2}a_{i}^{2}\) term.
Thank you.
Hello Mattia,
I agree it seems weird at first sight. Dropping \(x_i^2\) is easy to justify because it does not depend on \(\hat y\), but the last term \(\hat y^2 a_i^2\) does appear to depend on \(\hat y\), so why is it legitimate to drop it?
The answer is simple: \(\hat y\) is not an arbitrary real number; it can only take one of the two values in \(\{-1, 1\}\). Therefore \(\hat y^2 = 1\) is a constant, and the term \(\hat y^2 a_i^2 = a_i^2\) does not depend on which value you choose for \(\hat y\).
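If it helps, here is a small numerical check of that argument. It is only a sketch: I am assuming the quantity being minimised is \(\sum_i (x_i - \hat y a_i)^2\), which expands to \(\sum_i x_i^2 - 2\hat y \sum_i a_i x_i + \hat y^2 \sum_i a_i^2\). The function names are made up for illustration and are not from the course material.

```python
def full_objective(x, a, y_hat):
    # Assumed objective: sum_i (x_i - y_hat * a_i)^2
    return sum((xi - y_hat * ai) ** 2 for xi, ai in zip(x, a))


def cross_term_only(x, a, y_hat):
    # What remains after dropping x_i^2 and y_hat^2 * a_i^2
    return -2 * y_hat * sum(xi * ai for xi, ai in zip(x, a))


def dropped_terms(x, a, y_hat):
    # The two dropped terms: sum_i x_i^2 + y_hat^2 * a_i^2
    return sum(xi ** 2 + (y_hat ** 2) * ai ** 2 for xi, ai in zip(x, a))


def best_label(objective, x, a):
    # y_hat is restricted to the two labels -1 and +1
    return min((-1, 1), key=lambda y_hat: objective(x, a, y_hat))


x = [0.5, -1.0, 2.0]
a = [1.0, 0.3, -0.7]

# The dropped terms take the same value for both labels, because y_hat^2 = 1.
print(dropped_terms(x, a, -1) == dropped_terms(x, a, 1))

# So both objectives select the same label.
print(best_label(full_objective, x, a) == best_label(cross_term_only, x, a))
```

If \(\hat y\) were allowed to be any real number, `dropped_terms` would change with \(\hat y\) and the shortcut would no longer be valid.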
Have a good Sunday.
Nicolas