Analysis of the Bayes classifier: scaling everything by a constant

Hi,

I have a question concerning the end of the analysis of the Bayes classifier using all features. I don't see how we can rescale everything by \(1/(1 + \log(D))\) and obtain an equivalent situation. As I see it, if we start from the last expression of the argmax and rescale it, we do get noise with smaller variance, but the \(y \in \{\pm 1\}\) becomes \(\tilde{y} \in \{ \pm \frac{1}{1 + \log(D)}\}\), so we don't gain anything. Obviously I am missing something, so it would be great if you could tell me what.

Thanks,
Justin

Hi Justin,

Great question! Maybe I have been a little clumsy in my explanation; I hope it is explained better in the lecture notes.

Let's start from what we have at the end of page 12:

$$ \arg\max_{\hat y\in \{ -1,+1\}} \hat y y (1 + \log(D)) + \hat y Z \text{ with } Z \sim \mathcal{N} (0, 1+\log(D) ) $$

Since the argmax is unchanged when the objective is multiplied by a positive constant, we can rescale everything by \(1/(1+\log(D))\). The key point is that the factor \(1+\log(D)\) multiplying \(y\) cancels exactly, so \(y\) stays in \(\{\pm 1\}\); it is only the noise that shrinks. We obtain

$$ \arg\max_{\hat y\in \{ -1,+1\}} \hat y (y+ \tilde Z) \text{ with } \tilde Z \sim \mathcal{N} (0, 1/(1+\log(D)) ) $$
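To spell out where the new variance comes from (it is just the algebra of rescaling a Gaussian):

$$ \tilde Z = \frac{Z}{1+\log(D)} \quad\Longrightarrow\quad \operatorname{Var}(\tilde Z) = \frac{\operatorname{Var}(Z)}{(1+\log(D))^2} = \frac{1+\log(D)}{(1+\log(D))^2} = \frac{1}{1+\log(D)}. $$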

When the dimension becomes large, the variance of this Gaussian noise becomes very small, and the sign of \(y+\tilde Z\) is the same as the sign of \(y\) with high probability. Therefore, by taking the argmax you get \( \hat y = \operatorname{sign}(y+\tilde Z) = y\), and you recover the correct \(y\).
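If it helps to see this concretely, here is a quick numerical sketch (my own illustration, with arbitrary choices of \(D\) and sample size): it draws \(\tilde Z \sim \mathcal{N}(0, 1/(1+\log D))\) and estimates how often \(\operatorname{sign}(y+\tilde Z) = y\).

```python
import numpy as np

rng = np.random.default_rng(0)
y = 1                       # true label, in {-1, +1}
n_trials = 100_000          # Monte Carlo samples (arbitrary choice)

for D in [10, 1_000, 100_000, 10_000_000]:
    # tilde Z has variance 1/(1 + log D), hence std 1/sqrt(1 + log D)
    std = 1.0 / np.sqrt(1.0 + np.log(D))
    z_tilde = rng.normal(0.0, std, size=n_trials)
    # fraction of draws where the argmax recovers the correct label
    p_correct = np.mean(np.sign(y + z_tilde) == y)
    print(f"D = {D:>10}   Var = {std**2:.4f}   P(correct) = {p_correct:.4f}")
```

You should see the probability of recovering \(y\) approach 1 as \(D\) grows, though slowly, since the variance only decays like \(1/\log(D)\).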

Is it clearer now?

Best,
Nicolas

Hi,

Thanks a lot for your answer. It is perfectly clear now; I was not careful enough while reading the notes, and everything makes sense.

Best,
Justin

