Hello,

I don't understand why we can use the MSE for the update instead of the RMSE. Technically, it seems like they would lead to the same solution, but the expressions for the gradients are different: we get a leading constant when differentiating the RMSE.

Thanks for your help

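With the RMSE, the gradient is not defined at some points (for example, where the error is exactly zero, since the square root is not differentiable there), which could lead to problems.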
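To make this concrete, here is a small sketch, assuming a linear model with weights `w` and a tiny made-up dataset (the function names `mse`, `mse_grad`, and `rmse_grad` are mine, not from the course). By the chain rule, the RMSE gradient is the MSE gradient divided by `2 * RMSE`, so both point in the same direction and share the same minimizer. Note that this factor depends on `w`, so it is not really a constant, and it blows up when the MSE is zero.

```python
import numpy as np

def mse(w, X, y):
    """Mean squared error of the linear model X @ w."""
    e = y - X @ w
    return np.mean(e ** 2)

def mse_grad(w, X, y):
    """Gradient of the MSE with respect to w."""
    e = y - X @ w
    return -2.0 * X.T @ e / len(y)

def rmse_grad(w, X, y):
    """Gradient of sqrt(MSE) via the chain rule.

    d sqrt(u) = du / (2 * sqrt(u)), which is undefined when u == 0.
    """
    return mse_grad(w, X, y) / (2.0 * np.sqrt(mse(w, X, y)))

# Illustrative data only: y = 1 + 2x exactly, first column is the bias term.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

w = np.array([0.0, 0.0])
g_mse = mse_grad(w, X, y)
g_rmse = rmse_grad(w, X, y)

# Same direction, different length: g_rmse is g_mse scaled by 1 / (2 * RMSE).
print(g_mse, g_rmse)
```

At a perfect fit (here `w = [1, 2]`), the MSE gradient is simply zero, while the RMSE gradient becomes `0 / 0`. That is one practical reason to run gradient descent on the MSE and only report the RMSE afterwards.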

## Use MSE instead of RMSE

