word2vec
Hello,
I was going over the word2vec explanation and I am a bit confused about what exactly is being trained when we use w_d as the classifying vector in the video for lecture 13 part 2 (around 36:00). When we use SGD, do we update the weights of the w_d vector and not the w_n vector? When we train, we choose one nearby word and one far-away word, so is it like a two-step SGD iteration?
Also, what do we use as the initial k-dimensional vectors for all the words?
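To make my question concrete, here is a sketch of what I *think* one training step looks like, with one positive (nearby) and one negative (far-away) word, and small random values as the initial k-dimensional vectors. The dimensions, learning rate, and word indices are made up for illustration — I'm asking whether both W_n and W_d really get updated like this:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 8       # embedding dimension (assumed for illustration)
vocab = 100 # vocabulary size (assumed for illustration)
lr = 0.05

# One common initialization: small random values for every word vector.
W_n = rng.uniform(-0.5 / k, 0.5 / k, size=(vocab, k))  # "word" vectors w_n
W_d = rng.uniform(-0.5 / k, 0.5 / k, size=(vocab, k))  # "classifying" vectors w_d

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step(center, pos, neg):
    """One SGD step with a positive (nearby) and a negative (far) word.

    As I understand it, the logistic-loss gradient flows into BOTH the
    center word vector w_n and the two classifying vectors w_d."""
    v = W_n[center].copy()  # copy so updates use the pre-step values
    for ctx, label in ((pos, 1.0), (neg, 0.0)):
        u = W_d[ctx].copy()
        g = sigmoid(v @ u) - label   # gradient of logistic loss w.r.t. the score
        W_d[ctx] -= lr * g * v       # update classifying vector w_d
        W_n[center] -= lr * g * u    # update word vector w_n

# hypothetical word indices: center=3, nearby=7, far-away=42
sgd_step(center=3, pos=7, neg=42)
```

So in this sketch the positive and negative words are handled inside one SGD step rather than two separate iterations — is that the right way to read the lecture?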
Thank you for your help!