
CV test accuracy consistently higher than train accuracy

Hello,

I used 5-fold CV to train a model with the least-squares estimate, testing degrees from 1 to 10 for all features. Yet somehow the training and test errors not only stay essentially unchanged as d increases from 1 to 10, but the test error is also consistently lower than the training error. What could explain this? Is it necessarily a coding error somewhere in my algorithm, or can something else explain this phenomenon?
for d 1-10:
training accuracy: [0.744776, 0.744776, 0.744776, 0.744776, 0.744776, 0.744776, 0.744776, 0.744776, 0.744776]
test accuracy: [0.74495, 0.74495, 0.74495, 0.74495, 0.74495, 0.74495, 0.74495, 0.74495, 0.74495]
thanks a lot for your help,
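
For anyone comparing against their own implementation, here is a minimal sketch of the setup described above (5-fold CV, per-degree polynomial expansion, least squares), assuming 0/1 labels and an (n, p) feature matrix. The helper names build_poly, least_squares and cv_accuracy, and the 0.5 decision threshold, are illustrative choices, not the poster's actual code.

import numpy as np

def build_poly(x, degree):
    # Append powers 1..degree of every feature, plus a single bias column.
    n = x.shape[0]
    return np.hstack([np.ones((n, 1))] + [x ** d for d in range(1, degree + 1)])

def least_squares(y, tx):
    # Least-squares weights (minimum-norm solution if tx is rank deficient).
    w, *_ = np.linalg.lstsq(tx, y, rcond=None)
    return w

def cv_accuracy(y, x, degree, k=5, seed=1):
    # k-fold CV; the polynomial expansion is rebuilt inside every fold and
    # for every degree, rather than reusing one expanded matrix for all degrees.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    train_acc, test_acc = [], []
    for i in range(k):
        te = folds[i]
        tr = np.concatenate([folds[j] for j in range(k) if j != i])
        tx_tr, tx_te = build_poly(x[tr], degree), build_poly(x[te], degree)
        w = least_squares(y[tr], tx_tr)
        train_acc.append(np.mean((tx_tr @ w >= 0.5) == y[tr]))  # LPM-style 0.5 threshold
        test_acc.append(np.mean((tx_te @ w >= 0.5) == y[te]))
    return np.mean(train_acc), np.mean(test_acc)

# Usage (y in {0, 1}, x an (n, p) feature matrix):
# results = [cv_accuracy(y, x, degree=d) for d in range(1, 11)]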

Given that this is posted in the lectures section (please try to be more specific), I assume it has nothing to do with the exercises or Project 1.

Looking at the results, the difference is really small:
0.74495 - 0.744776 = 0.000174

It can happen that your test data is slightly ‘easier’ than your training data, explaining the lower test error. But if the train/test accuracy is exactly the same across the folds, I suspect you have an issue in your code.

Hello,

I realized I had an issue with my code: I wasn't updating the X that goes into CV after the polynomial expansion. I fixed it, but now I am even more perplexed, because both the test AND training accuracy went down when I increased the degree from 1 to 4, in both the LPM and the logit model. I thought increasing complexity automatically meant higher accuracy on the training data. How can increasing complexity not improve training accuracy?

thanks again for your help
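
On the last question: with a least-squares fit, adding polynomial features can never increase the training mean-squared error, because the degree-d feature set is nested inside the degree-(d+1) one. But the 0/1 accuracy obtained by thresholding the predictions is not monotone in the same way, so training accuracy can drop even while the squared-error fit improves. Below is a small self-contained illustration on made-up data; whether the accuracy actually dips at a given degree depends on the data, the only guaranteed part is that the training MSE cannot go up.

import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, size=(n, 1))
y = (x[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(float)  # noisy binary labels

def train_mse_and_acc(x, y, degree):
    # Least squares on polynomial features; return training MSE and training accuracy.
    tx = np.hstack([np.ones((len(x), 1))] + [x ** d for d in range(1, degree + 1)])
    w, *_ = np.linalg.lstsq(tx, y, rcond=None)
    pred = tx @ w
    return np.mean((y - pred) ** 2), np.mean((pred >= 0.5) == y)

for d in range(1, 5):
    mse, acc = train_mse_and_acc(x, y, d)
    # The MSE values are non-increasing in d; the accuracy values need not be.
    print(f"degree {d}: train MSE {mse:.4f}, train accuracy {acc:.3f}")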
