
[Splitting train/val/test]

Hi there!

I have a small question regarding the train/val/test split. I've looked into some answers online but I am still not convinced. The question is simple: why don't we train a model (generally a NN) on a bigger training set that combines the train and val sets, and validate the model using the test set?
With this strategy we have:

  • More data
  • Control over which parameters give the best performance on the test set

Thank you

This strategy is problematic because you would be explicitly optimizing over the test set. You would not fit the parameters of the model on the test set, which is good, but you would still tune its hyperparameters on it. If you try a large number of hyperparameter configurations (e.g., imagine doing a grid search over the learning rate, regularization parameter, number of units in your network, etc.), then it can happen just by chance that some lucky set of hyperparameters achieves a very low test error, while this will not generalize to unseen data.
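To make the "lucky set of hyperparameters" effect concrete, here is a minimal simulation sketch. It is an illustration under stated assumptions, not a real training run: every "configuration" is a model with no skill at all (it guesses labels at random), and the names `simulate`, `n_configs`, `n_test` and `n_fresh` are hypothetical. Picking the configuration with the best test accuracy still gives a score well above chance, while the same configuration stays near 50% on fresh data.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def simulate(n_configs=200, n_test=100, n_fresh=5000, seed=0):
    """Select the 'best' of many skill-free models using the test set.

    Returns (test accuracy of the selected config,
             accuracy of that same config on fresh, unseen data).
    """
    rng = random.Random(seed)
    test_labels = [rng.randint(0, 1) for _ in range(n_test)]
    fresh_labels = [rng.randint(0, 1) for _ in range(n_fresh)]

    best_test_acc, best_fresh_acc = -1.0, None
    for _ in range(n_configs):
        # Assumption: each hyperparameter configuration yields a model
        # that guesses at random, so its true accuracy is 50%.
        test_preds = [rng.randint(0, 1) for _ in range(n_test)]
        fresh_preds = [rng.randint(0, 1) for _ in range(n_fresh)]

        test_acc = accuracy(test_preds, test_labels)
        if test_acc > best_test_acc:
            # Selection happens on the test set, which is the mistake.
            best_test_acc = test_acc
            best_fresh_acc = accuracy(fresh_preds, fresh_labels)
    return best_test_acc, best_fresh_acc


if __name__ == "__main__":
    test_acc, fresh_acc = simulate()
    print(f"selected config, test accuracy:  {test_acc:.3f}")
    print(f"selected config, fresh accuracy: {fresh_acc:.3f}")
```

The gap between the two printed numbers is the optimism introduced by selecting on the test set. It grows with the number of configurations tried and shrinks as the test set gets larger.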

That's why a 3-way split is necessary: one set (train) to optimize the main parameters of the model, one (validation) to optimize the hyperparameters, and one (test) to report the final performance of the final training scheme.
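A 3-way split like this can be sketched in a few lines. This is only one possible way to do it: the function name `three_way_split` and the 60/20/20 proportions are assumptions chosen for illustration, not something prescribed above.

```python
import random


def three_way_split(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into train/val/test parts.

    The fractions are illustrative defaults; the three returned index
    lists are disjoint and together cover every sample exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)

    test_idx = indices[:n_test]                # touched once, at the very end
    val_idx = indices[n_test:n_test + n_val]   # used to pick hyperparameters
    train_idx = indices[n_test + n_val:]       # used to fit model parameters
    return train_idx, val_idx, test_idx


if __name__ == "__main__":
    train_idx, val_idx, test_idx = three_way_split(1000)
    print(len(train_idx), len(val_idx), len(test_idx))
```

The intended workflow is to fit each candidate on `train_idx`, compare candidates on `val_idx`, and evaluate only the single chosen candidate on `test_idx`.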

I hope this helps.

That makes perfect sense! Thank you very much.
