
K-fold cross validation vs train/test split

Hi,
In the 4th lecture, it is written that K-fold cross validation returns an unbiased estimate of the generalization error. What does that mean? Can we also find a bound on the generalization error that depends on the number of samples? Is the train/test split biased?
Thank you

Top comment

Regarding your first question, here is how I understand it: K-fold CV is an estimator of the generalization error. It is an unbiased estimator, meaning the expected value of the estimator equals the quantity it is trying to estimate (the generalization error).
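To make "estimator" concrete, here is a minimal sketch of K-fold CV as a procedure that returns a single number estimating the generalization error. This is not the lecture's code; the `fit`/`predict` callables and the toy mean-predictor "model" are my own illustrative assumptions:

```python
import numpy as np

def kfold_cv_error(X, y, fit, predict, k=5, seed=0):
    """Estimate generalization error by averaging the k held-out fold errors."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # shuffle once, then partition
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])
        # squared-error loss on the held-out fold
        errors.append(np.mean((predict(model, X[test_idx]) - y[test_idx]) ** 2))
    return float(np.mean(errors))

# Toy "model": predict the training-set mean (ignores the features entirely).
fit = lambda X, y: y.mean()
predict = lambda model, X: np.full(len(X), model)

rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=1.0, size=200)   # noise variance is 1.0
X = np.zeros((200, 1))
cv_err = kfold_cv_error(X, y, fit, predict, k=5)
```

Each data point appears in exactly one test fold, so every sample is used once for evaluation and K-1 times for training; the returned average is the CV estimate whose expectation the lecture is talking about.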

Second question: I'm pretty sure this can be derived from Hoeffding's inequality in the lecture on model selection (04a). The bound shrinks as |S| (the number of held-out samples) grows.
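As a sketch of how that dependence on |S| looks: for a loss bounded in [0, 1], Hoeffding's inequality gives, with probability at least 1 - δ, |test error - generalization error| ≤ sqrt(ln(2/δ) / (2|S|)). The function name and the δ = 0.05 default below are my own; check 04a for the exact form used in the lecture:

```python
import math

def hoeffding_bound(n, delta=0.05):
    """Width of the two-sided Hoeffding interval for a [0, 1]-bounded loss:
    with probability >= 1 - delta, the empirical error on n held-out samples
    is within this distance of the true generalization error."""
    return math.sqrt(math.log(2 / delta) / (2 * n))

for n in (100, 1000, 10000):
    print(n, round(hoeffding_bound(n), 4))
```

So the bound decays like 1/sqrt(|S|): to halve the interval you need four times as many test samples.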

Using a single train/test split will be biased in my understanding: the model is only trained on part of the data, so the estimate tends to be pessimistic for the model you would train on the full set. You can think of a single split as just one fold of a K-fold cross-validator, so it is also a much noisier estimator than averaging over all K folds.

https://en.wikipedia.org/wiki/Bias_of_an_estimator
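A quick way to see the "noisier estimator" point empirically (my own toy setup, using the same mean-predictor idea as above): repeat both procedures many times on the same data and compare the spread of the resulting estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.0, size=200)

def split_error(y, rng):
    # single 80/20 split: train mean as the "model", MSE on the held-out 20%
    idx = rng.permutation(len(y))
    train, test = idx[:160], idx[160:]
    return np.mean((y[test] - y[train].mean()) ** 2)

def cv_error(y, rng, k=5):
    # 5-fold CV: average the held-out error over all k folds
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    train_for = lambda i: np.concatenate([folds[j] for j in range(k) if j != i])
    return np.mean([np.mean((y[folds[i]] - y[train_for(i)].mean()) ** 2)
                    for i in range(k)])

single = [split_error(y, rng) for _ in range(200)]
cv = [cv_error(y, rng) for _ in range(200)]
print("std of single-split estimates:", np.std(single))
print("std of 5-fold CV estimates:   ", np.std(cv))
```

Both estimators target the same quantity, but the CV estimates cluster much more tightly because each one averages over five held-out folds covering the whole dataset.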

