Cost Function: MSE /2 (?)

Hello :)

In last week's lecture, we defined MSE as:

$$ \operatorname{MSE}(\mathbf{w}):=\frac{1}{N} \sum_{n=1}^{N}\left[y_{n}-f_{\mathbf{w}}\left(\mathbf{x}_{n}\right)\right]^{2} $$

Problem Set 2, however, defines MSE the following way:

$$ \mathcal{L}\left(w_{0}, w_{1}\right)=\frac{1}{2 N} \sum_{n=1}^{N}\left(y_{n}-f\left(x_{n 1}\right)\right)^{2}=\frac{1}{2 N} \sum_{n=1}^{N}\left(y_{n}-w_{0}-w_{1} x_{n 1}\right)^{2} $$

Are both equations equivalent? What is the effect of dividing MSE by 2? Is it easier to optimize, or does it only look nicer?

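The two expressions are not equal: they differ by a constant factor of 1/2. That constant doesn't matter for optimization, though. Multiplying a cost function by a positive constant does not move its minimum, so you will still find the same optimal weight(s). The 1/2 is often included because it cancels the factor of 2 that appears when you differentiate the square, which leaves a cleaner gradient, for example:

$$ \frac{\partial \mathcal{L}}{\partial w_{1}}=-\frac{1}{N} \sum_{n=1}^{N}\left(y_{n}-w_{0}-w_{1} x_{n 1}\right) x_{n 1} $$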
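If you want to convince yourself numerically, here is a minimal sketch in plain Python. The toy data, the learning rate, the step count, and the helper names `grad` and `gradient_descent` are all made up for illustration and are not taken from the lecture or the problem set. It runs gradient descent on both versions of the cost and compares the resulting weights.

```python
# Toy 1-D data, made up for illustration (not from the problem set).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
N = len(xs)


def grad(w0, w1, scale):
    """Gradient of scale * sum_n (y_n - w0 - w1 * x_n)^2 w.r.t. (w0, w1)."""
    g0 = sum(-2.0 * scale * (y - w0 - w1 * x) for x, y in zip(xs, ys))
    g1 = sum(-2.0 * scale * (y - w0 - w1 * x) * x for x, y in zip(xs, ys))
    return g0, g1


def gradient_descent(scale, lr=0.05, steps=5000):
    """Plain gradient descent starting from (0, 0); hypothetical helper."""
    w0, w1 = 0.0, 0.0
    for _ in range(steps):
        g0, g1 = grad(w0, w1, scale)
        w0 -= lr * g0
        w1 -= lr * g1
    return w0, w1


# Lecture version uses scale = 1/N, problem-set version uses scale = 1/(2N).
w_mse = gradient_descent(1.0 / N)
w_half = gradient_descent(1.0 / (2 * N))

# At any fixed point, the two gradients differ only by the factor 2.
g_mse = grad(0.0, 0.0, 1.0 / N)
g_half = grad(0.0, 0.0, 1.0 / (2 * N))

print("weights with 1/N :", w_mse)
print("weights with 1/2N:", w_half)
print("gradient ratio at (0, 0):", g_mse[0] / g_half[0], g_mse[1] / g_half[1])
```

Both runs should end up at (approximately) the same weights. The one practical difference is that, for the same learning rate, the halved cost takes steps half as large, so in this sketch the 1/2 acts like a rescaling of the learning rate rather than a change to the problem.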
