In the 2016 exam, question 18, I don't understand why "Comparing the word feature representations from bag-of-words vs GloVe, bag-of-words typically gives lower dimensional representations." is not a correct answer.
In GloVe, dim(W) = DxK
In bag-of-words, dim(W) = 1xK
So I would have concluded that it is true...
Bag-of-words is not 1xK but a one-hot encoding, so the size of the representation is the size of the vocabulary.
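To make the dimensionality concrete, here is a minimal sketch with a hypothetical toy vocabulary (the words are illustrative, not from the exam): each word's one-hot representation has length V, the vocabulary size.

```python
# Toy vocabulary to illustrate bag-of-words dimensionality (illustrative example).
vocab = ["the", "cat", "sat", "on", "mat"]
V = len(vocab)

def one_hot(word):
    """One-hot representation: a vector of length V (the vocabulary size)."""
    vec = [0] * V
    vec[vocab.index(word)] = 1
    return vec

print(one_hot("cat"))  # [0, 1, 0, 0, 0] -- dimension V, not 1
```

With a realistic vocabulary, V is in the tens of thousands, which is why this representation is high-dimensional.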
Ok, thank you!
But I thought that in GloVe we are also dealing with the co-occurrence matrix, which has dimension V x V (where V is the size of the vocabulary). Is this not correct?
Well, technically the co-occurrence matrix is of size \(V_1 \times V_2\), where \(V_1\) is the number of words you want to embed and \(V_2\) is the number of words you use to represent context (I am fairly sure you don't have to use the same words for both). By factorizing the matrix you get W of size \(V_1 \times K\), which is a low-dimensional representation of all words in the vocabulary (the representation of the i-th word is row i of the matrix). Therefore, the representation of a single word here is of size K, while for BoW it is of size V (and V is much larger than K).
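The factorization step can be sketched as follows. Note this is only an illustration of the shapes involved: GloVe actually fits the factorization with a weighted least-squares objective on log co-occurrence counts, whereas here a plain truncated SVD (an assumption for simplicity) is used on a random toy count matrix.

```python
import numpy as np

# Toy co-occurrence counts: V1 = V2 = 6 words, K = 2 embedding dimensions.
# (Random data; in practice C comes from counting word co-occurrences in a corpus.)
rng = np.random.default_rng(0)
C = rng.poisson(lam=3.0, size=(6, 6)).astype(float)

# Truncated SVD of log(1 + C) as a stand-in for GloVe's weighted factorization.
U, s, Vt = np.linalg.svd(np.log1p(C))
K = 2
W = U[:, :K] * np.sqrt(s[:K])  # word embedding matrix, shape (V1, K)

print(W.shape)  # (6, 2): each word is a K-dim row, vs a V-dim one-hot in BoW
```

The key point is the shape of W: one K-dimensional row per word, so the per-word representation is K-dimensional regardless of how large V is.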