### Question 18

In the 2016 exam, question 18, I don't understand why "Comparing the word feature representations from bag-of-words vs GloVe, bag-of-words typically gives lower dimensional representations." is not a correct answer.

In GloVe, dim(W) = DxK
In bag-of-words, dim(W) = 1xK

So I would have concluded that it is true...

Bag-of-words is not 1xK but a one-hot encoding, so the size of a word's representation is the size of the vocabulary.
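A minimal sketch of this point, using a hypothetical toy vocabulary: a one-hot word representation has one entry per vocabulary word, so its dimension is V, not 1xK.

```python
import numpy as np

# Hypothetical toy vocabulary (illustrative only).
vocab = ["the", "cat", "sat", "on", "mat"]
V = len(vocab)

def one_hot(word):
    # One-hot representation of a single word: a vector of size V,
    # with a 1 at the word's index and 0 elsewhere.
    vec = np.zeros(V)
    vec[vocab.index(word)] = 1.0
    return vec

print(one_hot("cat").shape)  # (V,) -- grows with the vocabulary size
```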

Ok, thank you!

But I thought that in GloVe we are also dealing with the co-occurrence matrix, which has dimension V x V (where V is the size of the vocabulary). Is this not correct?

Well, technically the co-occurrence matrix is of size $$V_1 \times V_2$$, where $$V_1$$ is the number of words you want to embed and $$V_2$$ is the number of words you use to represent context (I am fairly sure you don't have to use the same words for both). By factorizing the matrix you get W of size $$V_1 \times K$$, which is a low-dimensional representation of all words in the vocabulary (the representation of the i-th word is row i of the matrix). Therefore, the representation of a single word here is of size K, while for BoW it is of size V (and V is much larger than K).
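The factorization step can be sketched as follows. This is only an illustration with random fake counts: GloVe itself fits a weighted least-squares objective rather than a plain SVD, but a truncated SVD of the co-occurrence matrix produces factors of the same shapes, and the sizes here ($$V_1$$, $$V_2$$, K) are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

V1, V2, K = 100, 100, 10  # toy sizes; note K << V1
# Fake co-occurrence counts standing in for a real corpus.
C = rng.poisson(1.0, size=(V1, V2)).astype(float)

# Factorize C with a truncated SVD to get W of size V1 x K.
U, S, Vt = np.linalg.svd(C, full_matrices=False)
W = U[:, :K] * S[:K]  # V1 x K: one K-dimensional row per word

print(C.shape, W.shape)  # (100, 100) (100, 10)
word_i = W[3]            # embedding of word i is row i of W
print(word_i.shape)      # (10,) -- size K, not V
```

So even though the intermediate co-occurrence matrix is large, the per-word representation you end up with is only K-dimensional.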
