Q11 exam 2016

Hello,
Can you explain why c) and d) are correct in this question? I don't understand what "optimal representation points" or "optimal cluster" mean.
Thank you a lot!
Screenshot 2022-01-13 at 10.02.21.jpg

13 Jan '22 ·

anonymous

A K-means solution consists of two things:

1) k cluster centers (representation points), and
2) assignments of the original data to those clusters (optimal clusters).

A solution is optimal if it minimizes some particular cost function.
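
For concreteness, the cost usually used for K-means is the sum of squared distances between each point and the center of its cluster. Written with the assignment variables \(z_{nk}\), and assuming data points \(x_n\) and centers \(\mu_k\) (symbols chosen here for illustration, not taken from the exam), it would look like:

\[
\mathcal{L}(z, \mu) = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk} \, \| x_n - \mu_k \|_2^2,
\qquad z_{nk} \in \{0, 1\}, \quad \sum_{k=1}^{K} z_{nk} = 1 .
\]

For fixed \(z\), the minimizing \(\mu_k\) is the mean of the points assigned to cluster \(k\); for fixed \(\mu\), the minimizing \(z\) assigns each point to its nearest center.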

We usually do clustering by alternating between optimizing the cluster centers for a fixed cluster assignment, and optimizing the assignment for fixed cluster centers. The algorithm finishes as soon as either of those doesn't change anymore. The two cases (c) and (d) correspond to the last steps of this algorithm.
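
The alternation described above can be sketched as follows. This is a minimal NumPy illustration, not the course's reference implementation; the function names `assign`, `update_centers` and `kmeans` are made up for this example, and squared Euclidean distance is assumed as the cost.

```python
import numpy as np

def assign(X, centers):
    """Assignment step: index of the nearest center for every point."""
    # Squared Euclidean distances, shape (num_points, num_centers).
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def update_centers(X, z, centers):
    """Center step: mean of each cluster (an empty cluster keeps its old center)."""
    new = centers.copy()
    for k in range(len(centers)):
        if np.any(z == k):
            new[k] = X[z == k].mean(axis=0)
    return new

def kmeans(X, centers, max_iter=100):
    """Alternate the two steps until the assignment stops changing."""
    centers = np.asarray(centers, dtype=float)
    z = assign(X, centers)
    for _ in range(max_iter):
        centers = update_centers(X, z, centers)
        new_z = assign(X, centers)
        # Unchanged assignments imply unchanged centers on the next step.
        if np.array_equal(new_z, z):
            break
        z = new_z
    return centers, z
```

Calling `kmeans(X, initial_centers)` with `X` of shape `(N, D)` returns the final centers and one cluster index per point.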

13 Jan '22 ·

Thijs Vogels admin

Thank you so much for your answer.
But in the lesson it was said that you initialize K-means with the cluster centers, not with the assignment. So that is why I thought that (c) was false :/

13 Jan '22 ·

anonymous

I agree with the question asker!

Our k-means clustering algorithm does:
1) compute the assignment z_nk using centers
2) compute the new centers using new assignments z_nk

For c), if we initialize with optimal clusters but have horrible centers, we will recompute the clusters using the bad centers, so we will not get optimal clusters.

14 Jan '22 ·

anonymous

The question assumes that for c), after initializing with the assignments, you will proceed with re-computing the centers, and not by throwing those away and recomputing the assignments. Does that make sense?
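
A small sketch of the difference between the two orders, on toy data invented for this example (the helper names and the "bad" centers are hypothetical, and every cluster is assumed non-empty):

```python
import numpy as np

def centers_from_assignment(X, z, k):
    """Center step: mean of the points in each cluster (assumes no empty cluster)."""
    return np.stack([X[z == j].mean(axis=0) for j in range(k)])

def nearest_center(X, centers):
    """Assignment step: index of the nearest center for every point."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Two well-separated pairs of points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
z_init = np.array([0, 0, 1, 1])  # assumed to be the optimal assignment here

# Order assumed by the question: start from the assignment, compute centers first.
centers = centers_from_assignment(X, z_init, k=2)
z_next = nearest_center(X, centers)
converged = np.array_equal(z_next, z_init)  # assignment is already a fixed point

# Order from the objection: discard the assignment, reassign using bad centers.
bad_centers = np.array([[100.0, 100.0], [200.0, 200.0]])
z_bad = nearest_center(X, bad_centers)  # every point lands in cluster 0
```

With the first order the good assignment survives and the centers become the cluster means; with the second order the initial assignment is never used, which is the situation the earlier comment describes.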

17 Jan '22 ·

Thijs Vogels admin
