Final exam 2017, problem 4: k-NN with more data

Problem 4 final-exam 2017
Hello, I was wondering why having more data makes k-NN more accurate. Doesn't that depend on the extra data? Since the question didn't mention that the extra data was balanced, I thought of a counterexample: for a fixed k, if we add a lot more data from only one of the classes, won't that shift the decision boundary to one side and therefore decrease the accuracy? Am I wrong? I made a drawing to illustrate what I think:

This is assuming the question asks which answer is always true. I guess that in general, yes, it's better to have more data, but is it not always better?
[Attached images: Capture d’écran 2022-01-19 104302.jpg, Capture d’écran 2022-01-19 104809.jpg]
(Sorry for the bad handwriting; it says "new decision boundary" on the right.)
Thank you for your time.

Top comment

I think that if the extra data is selected randomly, it preserves the distribution, and thus k-NN's performance will improve.
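To make both points concrete, here is a small sketch in NumPy. Everything in it is an assumption made for illustration and not part of the exam problem: the two classes are 1-D Gaussians with means -1 and +1, k is fixed at 15, the test set is balanced, and the helper names `sample`, `knn_predict` and `accuracy` are made up. It compares a small balanced training set, a larger one drawn from the same distribution, and the small set plus many extra points from class 1 only.

```python
import numpy as np

def sample(rng, n0, n1):
    """Draw n0 points of class 0 (mean -1) and n1 points of class 1 (mean +1)."""
    x = np.concatenate([rng.normal(-1.0, 1.0, n0), rng.normal(1.0, 1.0, n1)])
    y = np.concatenate([np.zeros(n0, dtype=int), np.ones(n1, dtype=int)])
    return x, y

def knn_predict(x_train, y_train, x_test, k):
    """Plain majority-vote k-NN on 1-D inputs (k is odd, so no voting ties)."""
    dist = np.abs(x_test[:, None] - x_train[None, :])       # shape (n_test, n_train)
    nearest = np.argpartition(dist, k - 1, axis=1)[:, :k]   # indices of the k closest points
    votes_for_1 = y_train[nearest].mean(axis=1)
    return (votes_for_1 > 0.5).astype(int)

def accuracy(x_train, y_train, x_test, y_test, k):
    return float(np.mean(knn_predict(x_train, y_train, x_test, k) == y_test))

rng = np.random.default_rng(0)
k = 15                                      # fixed k, as in the question

x_test, y_test = sample(rng, 1000, 1000)    # balanced test set (assumption)
x_small, y_small = sample(rng, 20, 20)      # original small training set
x_iid, y_iid = sample(rng, 1000, 1000)      # more data, same distribution
x_extra, y_extra = sample(rng, 0, 2000)     # extra data from class 1 only

# The asker's scenario: keep the small set and add only class-1 points.
x_skew = np.concatenate([x_small, x_extra])
y_skew = np.concatenate([y_small, y_extra])

acc_small = accuracy(x_small, y_small, x_test, y_test, k)
acc_iid = accuracy(x_iid, y_iid, x_test, y_test, k)
acc_skewed = accuracy(x_skew, y_skew, x_test, y_test, k)

print(f"small balanced set:      {acc_small:.3f}")
print(f"more data, same distrib: {acc_iid:.3f}")
print(f"more data, class 1 only: {acc_skewed:.3f}")
```

In this setup the extra class-1 points crowd the neighbourhood of almost every test point, so the 15 nearest neighbours are mostly class 1 and the boundary moves far to the left, which is the effect described in the question. So "more data helps" seems to rely on the extra data coming from the same distribution as the test data.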
