If we had no prior p(X=0)=1/3, could we say that the estimator that minimizes the probability of error is the MLE? Or would the correct answer be the MAP estimator with a uniform prior?

If the MAP estimator minimizes the probability of error, when would we use the MLE? Only when there is no prior? Is there an advantage to using the MLE instead of MAP? Is there something that the MLE does better?

MAP and MLE are equivalent when the prior is uniform. Otherwise, the MAP rule yields a smaller (or at worst equal) error probability, but you often cannot use it in practice because it requires the prior to be known. Therefore, people use MLE instead: under standard regularity conditions it is consistent, meaning the estimate converges to the true parameter as the number of samples grows.
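
To make the comparison concrete, here is a small sketch in Python. The prior p(X=0)=1/3, p(X=1)=2/3 is the one from the question, but the likelihood table P(Y|X) is made up purely for illustration (it is not from the original problem), as are the helper names `decide` and `error_probability`. The ML rule is computed as MAP with a uniform prior, which is exactly the equivalence mentioned above.

```python
from fractions import Fraction as F

# Hypothetical likelihoods P(Y=y | X=x) for y in {0, 1, 2}.
# These numbers are an assumption for illustration only.
likelihood = {
    0: [F(5, 10), F(4, 10), F(1, 10)],
    1: [F(2, 10), F(3, 10), F(5, 10)],
}

def decide(prior):
    """For each observation y, pick the x maximizing prior[x] * P(y | x)."""
    return [
        max(likelihood, key=lambda x: prior[x] * likelihood[x][y])
        for y in range(3)
    ]

def error_probability(rule, prior):
    """Probability that rule[y] differs from the true X, under the given prior."""
    p_correct = sum(prior[rule[y]] * likelihood[rule[y]][y] for y in range(3))
    return 1 - p_correct

true_prior = {0: F(1, 3), 1: F(2, 3)}   # prior from the question
uniform = {0: F(1, 2), 1: F(1, 2)}      # uniform prior turns MAP into ML

ml_rule = decide(uniform)       # ML decisions for y = 0, 1, 2
map_rule = decide(true_prior)   # MAP decisions for y = 0, 1, 2

# Both rules are scored against the true (non-uniform) prior.
ml_error = error_probability(ml_rule, true_prior)
map_error = error_probability(map_rule, true_prior)

print("ML rule: ", ml_rule, "error =", ml_error)
print("MAP rule:", map_rule, "error =", map_error)
```

With this particular table the two rules disagree only at y=1, and the error probability is 11/30 for ML versus 3/10 for MAP. A different likelihood table could make the two rules coincide, in which case the errors would be equal.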

But here p(one of the two choices for X)=1/3 and p(other choice)=2/3, so the prior is not uniform, is it? Or is the prior the probability p(Y), which makes the information about X irrelevant in our case?

