Regularization - Offset term
Hello,
As far as I've understood, regularization under the \( L_1 \) or \( L_2 \) norm implicitly ignores the offset term \( w_0 \). I was wondering whether this is always done in practice, since it is not mentioned in the slides.
Yes, in practice the bias term rarely gets regularized, because it does not contribute strongly to the complexity of the model. Shrinking the bias would not, for example, make the decision surface any smoother; it only shifts where the surface sits.
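As a concrete illustration, here is a minimal sketch (not from the slides; function names and the use of ridge regression are my own assumptions) of an \( L_2 \)-penalized squared-error loss and its gradient where the penalty skips \( w_0 \). The convention assumed is that the design matrix carries a leading column of ones, so `w[0]` plays the role of the offset:

```python
import numpy as np

def l2_penalized_loss(w, X, y, lam):
    """Squared-error loss with an L2 penalty on w[1:] only.

    Assumes X has a leading column of ones, so w[0] is the offset term
    and is deliberately excluded from the penalty.
    """
    residual = X @ w - y
    data_term = 0.5 * np.sum(residual ** 2)
    penalty = 0.5 * lam * np.sum(w[1:] ** 2)  # note: w[0] is skipped
    return data_term + penalty

def l2_penalized_grad(w, X, y, lam):
    """Gradient of the loss above; the regularization part of the
    gradient is zeroed out for the offset coordinate."""
    grad = X.T @ (X @ w - y)
    reg = lam * w.copy()
    reg[0] = 0.0  # offset term receives no shrinkage
    return grad + reg
```

One way to see that the offset is untouched: changing `lam` changes the loss only through `w[1:]`, and the first component of the gradient is the same with or without regularization.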