Exercise 4.5 meaning of question 1

I don't really understand what it means to "Print the top-10 topics as a combination of 10 terms and 10 documents". Does that mean we need to take 10 random terms and 10 random documents and find the top-10 topics associated with those? Or find the top-10 topics that have the highest combined "score" between the 10 random documents and the 10 random terms?

Thank you very much

Top comment

None of the topics, terms, or documents are to be chosen at random. You should take the 10 topics with the highest importance in representing the corpus and give two representations for each of them: first as a combination of 10 terms, then as a combination of 10 documents. The terms and documents in each representation should be the 10 that are most important in representing that topic.

Think about how you can get the importance of each topic, and the importance of each term or document for a particular topic, from the SVD.
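
For anyone reading this later, here is a minimal sketch of one way to read those importances off the SVD. It is not the official solution. It assumes a term-document matrix X with terms as rows and documents as columns, treats the singular values as topic importance, and ranks terms and documents by the absolute value of their weights. The function name top_topics, the toy matrix, and the labels are made up for illustration and do not come from the exercise.

```python
import numpy as np

def top_topics(X, terms, docs, n_topics=10, n_top=10):
    """Rank topics, terms, and documents from the SVD of X.

    Assumption: X has shape (n_terms, n_docs). With the transposed
    layout, the roles of U and Vt below are swapped.
    """
    # X = U @ diag(s) @ Vt, with s sorted in descending order by NumPy.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    topics = []
    for k in range(min(n_topics, len(s))):
        # Column k of U weights each term for topic k;
        # row k of Vt weights each document for topic k.
        # Ranking by absolute value is a choice, not a requirement.
        term_idx = np.argsort(-np.abs(U[:, k]))[:n_top]
        doc_idx = np.argsort(-np.abs(Vt[k, :]))[:n_top]
        topics.append({
            "importance": float(s[k]),
            "terms": [terms[i] for i in term_idx],
            "docs": [docs[j] for j in doc_idx],
        })
    return topics

# Toy corpus (hypothetical): two clearly separated groups of terms and documents.
X = np.array([
    [3.0, 3.0, 0.0, 0.0],
    [3.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
])
terms = ["a", "b", "c", "d"]
docs = ["d0", "d1", "d2", "d3"]

topics = top_topics(X, terms, docs, n_topics=2, n_top=2)
for rank, topic in enumerate(topics, start=1):
    print(rank, topic["importance"], topic["terms"], topic["docs"])
```

On the real corpus you would keep the defaults n_topics=10 and n_top=10. Whether to rank by signed or absolute weight depends on how you interpret negative entries in the singular vectors, so check what the exercise expects.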

Thank you very much, that answers all my questions!
