Hi, I do not quite get how the solutions to these two questions are not contradicting each other. How can the computational cost of full gradient descent increase when the number of observed entries increases, while the cost of ALS does not, even though both depend only on \(ND\) (or \(KND\))? Thank you!!
The full gradient in Q24 is a sum over all observed entries, so it clearly gets more expensive as the number of observed entries grows.
The stochastic gradient in Q24 samples one entry at a time. While the algorithm may need more steps to converge, each step remains the same cost.
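To see the asymmetry concretely, here is a minimal numpy sketch (my own illustration, not the exam's notation): \(X \approx UV^\top\) with latent dimension \(D\), and `omega` is the set of observed entries. The full gradient loops over all of `omega`, so its cost grows with the number of observed entries, while one stochastic step touches a single sampled entry and costs \(O(D)\) no matter how many entries are observed.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, D = 30, 40, 5                       # rows, columns, latent dimension
U_true = rng.normal(size=(M, D))
V_true = rng.normal(size=(N, D))
X = U_true @ V_true.T + 0.1 * rng.normal(size=(M, N))
# Observed entries: each cell kept with probability 0.3.
omega = [(i, j) for i in range(M) for j in range(N) if rng.random() < 0.3]

def loss(U, V, X, omega):
    """Squared error over the observed entries only."""
    return sum((X[i, j] - U[i] @ V[j]) ** 2 for i, j in omega)

def full_gradient_U(U, V, X, omega):
    """Gradient w.r.t. U: one term per observed entry, so cost grows with |omega|."""
    G = np.zeros_like(U)
    for i, j in omega:
        err = X[i, j] - U[i] @ V[j]
        G[i] -= 2 * err * V[j]
    return G

def sgd_step(U, V, X, omega, lr=0.01):
    """One stochastic step: samples a single entry, so cost is O(D), independent of |omega|."""
    i, j = omega[rng.integers(len(omega))]
    err = X[i, j] - U[i] @ V[j]
    U[i] += lr * err * V[j]
    V[j] += lr * err * U[i]
```

The function and variable names here are my own; the point is only that `full_gradient_U` iterates over every observed entry while `sgd_step` looks at exactly one.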
The ALS method does get more expensive end-to-end, but the matrices that need to be inverted in this algorithm are always of shape \(D \times D\), so each inversion remains the same cost. Only building those matrices gets more costly as the number of observed entries grows.
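A sketch of one ALS half-step makes the split visible (again my own illustration, with assumed names, not the exam solution): accumulating the normal equations touches every observed entry of the row, but the system that gets solved is always \(D \times D\).

```python
import numpy as np

def als_update_U(U, V, X, omega, lam=0.1):
    """One ALS half-step: update every row of U with V held fixed.

    Building A and b touches each observed entry of the row (cost grows
    with the number of observed entries), but the solve is always on a
    D x D matrix, independent of how many entries are observed.
    """
    M, D = U.shape
    U_new = np.zeros_like(U)
    for i in range(M):
        cols = [j for (ii, j) in omega if ii == i]  # entries observed in row i
        A = lam * np.eye(D)                         # D x D regularized normal matrix
        b = np.zeros(D)
        for j in cols:
            A += np.outer(V[j], V[j])
            b += X[i, j] * V[j]
        U_new[i] = np.linalg.solve(A, b)            # constant-cost D x D solve
    return U_new
```

Since each half-step exactly minimizes the regularized squared error over that factor, the objective can only go down, while the linear-algebra cost per solve stays fixed at \(O(D^3)\).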
Exam 2019 Q24 & 25