Hi, I do not quite get how the solutions to these two questions are not contradicting each other. How can the computational cost of full gradient descent increase when the number of observed entries increases, while the cost of ALS does not, even though both depend only on \(ND\) (or \(KND\))? Thank you!!
The full gradient in Q24 is a sum over all observed entries, so it clearly gets more expensive as the number of observed entries grows.
The stochastic gradient in Q24 samples one entry at a time. While the algorithm may need more steps to converge, each step remains the same cost.
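To see the asymmetry concretely, here is a minimal numpy sketch (my own illustration, not the exam's notation): \(X \approx UV^\top\) with latent dimension \(D\), and `omega` is the set of observed entries. The full gradient loops over all of `omega`, so its cost grows with the number of observed entries, while one stochastic step touches a single sampled entry and costs \(O(D)\) no matter how many entries are observed.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, D = 30, 40, 5                       # rows, columns, latent dimension
U_true = rng.normal(size=(M, D))
V_true = rng.normal(size=(N, D))
X = U_true @ V_true.T + 0.1 * rng.normal(size=(M, N))
# Observed entries: each cell kept with probability 0.3.
omega = [(i, j) for i in range(M) for j in range(N) if rng.random() < 0.3]

def loss(U, V, X, omega):
    """Squared error over the observed entries only."""
    return sum((X[i, j] - U[i] @ V[j]) ** 2 for i, j in omega)

def full_gradient_U(U, V, X, omega):
    """Gradient w.r.t. U: one term per observed entry, so cost grows with |omega|."""
    G = np.zeros_like(U)
    for i, j in omega:
        err = X[i, j] - U[i] @ V[j]
        G[i] -= 2 * err * V[j]
    return G

def sgd_step(U, V, X, omega, lr=0.01):
    """One stochastic step: samples a single entry, so cost is O(D), independent of |omega|."""
    i, j = omega[rng.integers(len(omega))]
    err = X[i, j] - U[i] @ V[j]
    U[i] += lr * err * V[j]
    V[j] += lr * err * U[i]
```

The function and variable names here are my own; the point is only that `full_gradient_U` iterates over every observed entry while `sgd_step` looks at exactly one.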
The ALS method does get more expensive end-to-end, but the matrices that need to be inverted in this algorithm are always of shape \(D \times D\), so each inversion remains the same cost. Only building those matrices gets more costly as the number of observed entries grows.
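A sketch of one ALS half-step makes the split visible (again my own illustration, with assumed names, not the exam solution): accumulating the normal equations touches every observed entry of the row, but the system that gets solved is always \(D \times D\).

```python
import numpy as np

def als_update_U(U, V, X, omega, lam=0.1):
    """One ALS half-step: update every row of U with V held fixed.

    Building A and b touches each observed entry of the row (cost grows
    with the number of observed entries), but the solve is always on a
    D x D matrix, independent of how many entries are observed.
    """
    M, D = U.shape
    U_new = np.zeros_like(U)
    for i in range(M):
        cols = [j for (ii, j) in omega if ii == i]  # entries observed in row i
        A = lam * np.eye(D)                         # D x D regularized normal matrix
        b = np.zeros(D)
        for j in cols:
            A += np.outer(V[j], V[j])
            b += X[i, j] * V[j]
        U_new[i] = np.linalg.solve(A, b)            # constant-cost D x D solve
    return U_new
```

Since each half-step exactly minimizes the regularized squared error over that factor, the objective can only go down, while the linear-algebra cost per solve stays fixed at \(O(D^3)\).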
Exam 2019 Q24 & 25