
Explanation of the error probability of the Bayes classifier in kNN

Good evening

I am sorry, but I was unable to understand how to derive the expression for the loss of the Bayes classifier in the kNN lecture. I read in the lecture notes that it was related to MAP estimation... I am not sure I see how \(x\) and \(y\) interact (through which probabilities, which are known, which are unknown). That could explain why I don't understand.

Thanks in advance for your help!

First, the Bayes classifier is not specific to kNN; it is a concept from classification in general.

It is the classifier (a function from X to Y) that achieves the lowest possible classification error; in that sense it is the best classifier you can hope for.

Its main drawback is that you cannot compute it: to do so you would have to know the distribution \(D\) of the data. Therefore you cannot use this classifier in practice, but it gives you a baseline to compare against on toy problems where you do know the distribution, and it is the classifier we compare ourselves to in our theoretical results.

So, to reply to your question. We have a classification problem \( (X,Y) \sim D \). We can decompose this distribution by first considering the marginal distribution \( X \sim D_X \) and then the conditional distribution \( Y \mid x \sim D_{Y|x} \).
Now we can define the function \(\eta(x) = \mathbb{P}( Y = 1 \mid X = x) \), which is the probability that the label of \(x\) is 1.

So imagine you know \(D\), and therefore you can compute \( \eta \). How would you make your prediction? Given an \(x\), you can compute the probability that the associated label is 1 by evaluating the function \(\eta\) at \(x\). If the value is larger than .5, the label 1 is more likely than the label 0, so you should predict 1. If the value is smaller than .5, the label 0 is more likely, so you should predict 0. This is the best you can hope for, and it is exactly the Bayes classifier

$$f_*(x) = \mathbb{1}_{\eta(x) >.5}$$
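To make this concrete, here is a minimal sketch in Python. The distribution (and hence \(\eta\)) below is a made-up example, not one from the lecture: I assume \(\eta\) is a logistic function of \(x\), so the Bayes classifier simply thresholds it at .5.

```python
import numpy as np

def eta(x):
    # Hypothetical example of a known conditional probability
    # eta(x) = P(Y = 1 | X = x): a logistic curve crossing .5 at x = 0.
    return 1.0 / (1.0 + np.exp(-x))

def bayes_classifier(x):
    # f_*(x) = 1 if eta(x) > .5, else 0
    return (eta(x) > 0.5).astype(int)

x = np.array([-2.0, -0.1, 0.3, 5.0])
print(bayes_classifier(x))  # [0 0 1 1]
```

Of course, this only works because `eta` is known in closed form here; in practice \(\eta\) is unknown, which is exactly why the Bayes classifier is only a theoretical baseline.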

Now you want to derive the classification loss of this classifier.

$$L (f_*) = \mathbb{P} (y \neq f_*(x)) = \mathbb{E} \left( \mathbb{1}_{y \neq f_*(x)} \right) $$

$$ = \mathbb{E}_x \left[ \mathbb{E}_y \left( \mathbb{1}_{y \neq f_*(x)} \mid x \right) \right] = \mathbb{E}_x \left[ \eta(x) \, \mathbb{1}_{1 \neq f_*(x)} + (1-\eta(x)) \, \mathbb{1}_{0 \neq f_*(x)} \right] $$

$$ = \mathbb{E}_x \left[ \eta(x) \, \mathbb{1}_{\eta(x)<.5} + (1-\eta(x)) \, \mathbb{1}_{\eta(x)>.5} \right] $$

$$ = \mathbb{E}_x \left[ \min\{ \eta(x) , 1-\eta(x) \} \right] $$
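You can also check this formula numerically. The sketch below uses a made-up distribution (it is an assumption for illustration, not from the lecture): \(X \sim \mathcal{N}(0,1)\) and \(Y \mid x \sim \mathrm{Bernoulli}(\eta(x))\) with a logistic \(\eta\). The empirical error of the Bayes classifier should match \(\mathbb{E}_x[\min\{\eta(x), 1-\eta(x)\}]\).

```python
import numpy as np

rng = np.random.default_rng(0)

def eta(x):
    # Hypothetical conditional probability P(Y = 1 | X = x)
    return 1.0 / (1.0 + np.exp(-x))

n = 200_000
x = rng.normal(0.0, 1.0, size=n)          # X ~ N(0, 1)
y = (rng.random(n) < eta(x)).astype(int)  # Y | x ~ Bernoulli(eta(x))

pred = (eta(x) > 0.5).astype(int)         # Bayes classifier f_*(x)

empirical_error = np.mean(pred != y)
theoretical_error = np.mean(np.minimum(eta(x), 1 - eta(x)))

# The two estimates agree up to Monte Carlo noise
print(empirical_error, theoretical_error)
```

With enough samples the two printed values are close, which is a nice sanity check of the derivation above.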

This is a good exercise, and I can go through it again during the Q&A if you are interested.

