
Explanation of Bayes' Formula

Hello! In the 2018 midterm, question 2, we are asked to write a probabilistic model such that its solution coincides with the MAP estimate. In the solutions, we deduce from Bayes' formula that

\( \mathbf{w}_{\mathrm{MAP}}^{\star}=\arg \max _{\mathbf{w}} p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}) p(\mathbf{w}) \)

which, I guess, results from this:
\( p(\mathbf{w}|\mathbf{X},\mathbf{y}) \propto p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}) p(\mathbf{w}) \)

However, when we apply Bayes' formula to the LHS, we don't arrive at the same RHS; in particular, we get:
\( p(\mathbf{w}|\mathbf{X},\mathbf{y}) = \frac{ p(\mathbf{y}, \mathbf{X} \mid \mathbf{w})p(\mathbf{w})}{p(\mathbf{y},\mathbf{X})} \)

As you can see, the numerator is not the same. Can someone explain the steps more precisely?

Let's start by writing out Bayes' Theorem:

$$\begin{align} p(\mathbf{w}|\mathbf{X},\mathbf{y}) &= \frac{ p(\mathbf{y}, \mathbf{X} \mid \mathbf{w})p(\mathbf{w})}{p(\mathbf{y},\mathbf{X})}\\ &= \frac{ p(\mathbf{y}\mid \mathbf{X}, \mathbf{w})p(\mathbf{X} \mid \mathbf{w})p(\mathbf{w})}{p(\mathbf{y},\mathbf{X})}\\ &= \frac{ p(\mathbf{y}\mid \mathbf{X}, \mathbf{w})p(\mathbf{X})p(\mathbf{w})}{p(\mathbf{y},\mathbf{X})} \end{align}$$

In the last step, we use the fact that the distribution of the input features \(\mathbf{X}\) does not depend on the model parameters \(\mathbf{w}\), i.e., \(p(\mathbf{X} \mid \mathbf{w}) = p(\mathbf{X})\).

Now, computing the MAP estimate, we get:

$$\begin{align} \mathbf{w}_{\mathrm{MAP}}^{\star} &= \arg \max _{\mathbf{w}} p(\mathbf{w}|\mathbf{X},\mathbf{y})\\ &= \arg \max _{\mathbf{w}} \frac{ p(\mathbf{y}\mid \mathbf{X}, \mathbf{w})p(\mathbf{X})p(\mathbf{w})}{p(\mathbf{y},\mathbf{X})}\\ &= \arg \max _{\mathbf{w}} p(\mathbf{y}\mid \mathbf{X}, \mathbf{w})p(\mathbf{w}) \end{align}$$

In the last step, we drop all factors that do not depend on the model parameters \(\mathbf{w}\), since they do not affect the arg max.
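
To see the last step concretely: the denominator \(p(\mathbf{y}, \mathbf{X})\) is a constant with respect to \(\mathbf{w}\), so dropping it shifts the (log-)posterior but cannot move its maximizer. Below is a minimal numerical sketch of this, using a toy 1-D Gaussian model with made-up values for the noise scale \(\sigma\) and prior scale \(\tau\) (not from the course material):

```python
import numpy as np

# Toy 1-D model: y ~ N(x * w, sigma^2), prior w ~ N(0, tau^2).
# sigma, tau and the data below are assumed values for illustration only.
rng = np.random.default_rng(0)
sigma, tau = 0.5, 1.0
x = rng.normal(size=20)
y = 2.0 * x + sigma * rng.normal(size=20)  # data generated with true w = 2

w_grid = np.linspace(-5, 5, 10001)
log_lik = np.array([-np.sum((y - w * x) ** 2) / (2 * sigma**2) for w in w_grid])
log_prior = -w_grid**2 / (2 * tau**2)
log_unnorm = log_lik + log_prior  # log[ p(y | X, w) p(w) ], up to constants

# The evidence p(y | X) does not depend on w, so subtracting its log shifts
# the curve vertically but leaves the arg max unchanged.
log_evidence = log_unnorm.max() + np.log(np.trapz(np.exp(log_unnorm - log_unnorm.max()), w_grid))
log_posterior = log_unnorm - log_evidence

assert np.argmax(log_unnorm) == np.argmax(log_posterior)
print("w_MAP ≈", w_grid[np.argmax(log_unnorm)])  # close to the true w = 2
```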

It's perfectly clear, thank you!

Will Bayes' Theorem and Bayes Nets be on this exam?

@Anonymous said:
Will Bayes' Theorem and Bayes Nets be on this exam?

Yes, Bayes' Theorem is certainly material to study for the exam. In relation to this topic, it is used, for example, in the interpretation of Ridge Regression as a MAP estimator in Lecture 3: https://github.com/epfml/ML_course/blob/master/lectures/03/lecture03d_ridge.pdf
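
As a concrete (unofficial) sketch of that interpretation: assuming a Gaussian likelihood \(\mathbf{y} \sim \mathcal{N}(\mathbf{X}\mathbf{w}, \sigma^2 I)\) and a Gaussian prior \(\mathbf{w} \sim \mathcal{N}(\mathbf{0}, \tau^2 I)\) (the slides may use different notation), the MAP estimate coincides with the ridge solution with \(\lambda = \sigma^2 / \tau^2\). The values of \(\sigma\), \(\tau\) and the data below are made up for illustration:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, d = 50, 3
sigma, tau = 0.3, 1.0  # assumed noise and prior scales
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + sigma * rng.normal(size=n)

# Closed-form ridge solution: (X^T X + lambda I)^{-1} X^T y, lambda = sigma^2 / tau^2.
lam = sigma**2 / tau**2
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Negative log posterior (up to constants): the ridge objective in disguise.
def neg_log_posterior(w):
    return np.sum((y - X @ w) ** 2) / (2 * sigma**2) + np.sum(w**2) / (2 * tau**2)

w_map = minimize(neg_log_posterior, np.zeros(d)).x
print(np.allclose(w_ridge, w_map, atol=1e-4))  # True: same maximizer
```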

For Bayesian Networks, the answer will be given in the (existing) topic here: http://oknoname.herokuapp.com/forum/topic/485/exam-bayes-net-2019/

The interpretation of Ridge Regression as a MAP estimator is in the lecture notes but was not covered during the lecture.
Should we study everything in the additional notes? Or can we assume that it is extra material?

I have the same question as the last commenter ^ isn't it considered extra material? But I am also confused about when we should use MAP and when we should use MLE. For example, in the probabilistic approach to least squares, we don't consider the prior when calculating the likelihood.
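
(On the MAP vs. MLE part of this question, a brief note in the notation of this thread: the only difference is whether the prior term is included,

$$\mathbf{w}_{\mathrm{MLE}}^{\star}=\arg \max _{\mathbf{w}} p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}), \qquad \mathbf{w}_{\mathrm{MAP}}^{\star}=\arg \max _{\mathbf{w}} p(\mathbf{y} \mid \mathbf{X}, \mathbf{w})\, p(\mathbf{w}).$$

With Gaussian noise, the MLE gives ordinary least squares; adding a Gaussian prior on \(\mathbf{w}\) turns the same problem into Ridge Regression, which is the MAP interpretation referenced above.)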

As usual, exam material is only what was covered in the (video) lectures and lab sessions.

But, also as usual, some of the additional bits might help you better understand the lecture materials.
