In the 2016 exam, question 18, I don't understand why "Comparing the word feature representations from bag-of-words vs GloVe, bag-of-words typically gives lower dimensional representations." is not a correct answer.
In GloVe, dim(W) = DxK
In bag-of-words, dim(W) = 1xK
So I would have concluded that it is true...
Bag-of-words is not 1xK but a one-hot encoding, so the size of the representation is the size of the vocabulary.
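To make the dimensionality concrete, here is a minimal sketch with a hypothetical toy vocabulary (the words are illustrative, not from the exam): each word's one-hot representation has length V, the vocabulary size.

```python
# Toy vocabulary to illustrate bag-of-words dimensionality (illustrative example).
vocab = ["the", "cat", "sat", "on", "mat"]
V = len(vocab)

def one_hot(word):
    """One-hot representation: a vector of length V (the vocabulary size)."""
    vec = [0] * V
    vec[vocab.index(word)] = 1
    return vec

print(one_hot("cat"))  # [0, 1, 0, 0, 0] -- dimension V, not 1
```

With a realistic vocabulary, V is in the tens of thousands, which is why this representation is high-dimensional.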
Ok, thank you!
But I thought that in GloVe we are also dealing with the co-occurrence matrix, which has dimension V x V (where V is the size of the vocabulary). Is this not correct?
Well, technically the co-occurrence matrix is of size \(V_1 \times V_2\), where \(V_1\) is the number of words you want to embed and \(V_2\) is the number of words you use to represent context (I am fairly sure you don't have to use the same words for both). By factorizing the matrix you get W of size \(V_1 \times K\), which is a low-dimensional representation of all words in the vocabulary (the representation of the i-th word is row i of the matrix). Therefore, the representation of a single word here is of size K, while for BoW it is of size V (and V is much larger than K).
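The factorization step can be sketched as follows. Note this is only an illustration of the shapes involved: GloVe actually fits the factorization with a weighted least-squares objective on log co-occurrence counts, whereas here a plain truncated SVD (an assumption for simplicity) is used on a random toy count matrix.

```python
import numpy as np

# Toy co-occurrence counts: V1 = V2 = 6 words, K = 2 embedding dimensions.
# (Random data; in practice C comes from counting word co-occurrences in a corpus.)
rng = np.random.default_rng(0)
C = rng.poisson(lam=3.0, size=(6, 6)).astype(float)

# Truncated SVD of log(1 + C) as a stand-in for GloVe's weighted factorization.
U, s, Vt = np.linalg.svd(np.log1p(C))
K = 2
W = U[:, :K] * np.sqrt(s[:K])  # word embedding matrix, shape (V1, K)

print(W.shape)  # (6, 2): each word is a K-dim row, vs a V-dim one-hot in BoW
```

The key point is the shape of W: one K-dimensional row per word, so the per-word representation is K-dimensional regardless of how large V is.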