theory question complexity
Hello,
I understand that we consider D different values W_i for each dimension, but how is the × N factor obtained? If N is the number of samples, I'm not sure why that holds, because for every W combination, evaluating the loss isn't a single-step process.
Thanks for your help.
Hi, the assumption is that calculating the loss for a single sample is O(1), so doing it for all N samples is O(N). The O(N) matters because SGD does not require O(N) work for each step, whereas grid search does.
Best,
Semih
Thank you for your quick response. I understood that was the assumption, but I am not sure why we can say that calculating the loss has O(1) complexity.
It is more a matter of choice. Here we choose to say evaluating the loss has complexity O(N) instead of O(1). We chose O(N) to stress that some algorithms (like SGD) do not look at the whole dataset to make a single decision. An example may make this clearer. Suppose we define a new algorithm, batched grid search, which samples B examples at random from the training set and uses only those to evaluate a single grid location. If we assumed loss calculation were O(1), we could not really show that batched grid search is faster than full grid search. However, if we assume calculating the loss over the whole dataset is O(N), then the batched evaluation has O(B) complexity, and the reduction becomes visible.
Here we choose O(N) because we will later compare grid search with SGD, and SGD indeed does batched computation.
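To make the accounting concrete, here is a minimal sketch of the two variants for a single parameter w with an MSE loss. This is purely illustrative (the model, the grid, and the names `grid_search` / `batched_grid_search` are my own assumptions, not from the lecture): each grid evaluation of `grid_search` touches all N samples, while `batched_grid_search` touches only B.

```python
import random

def loss(w, data):
    # Mean squared error of a 1-D linear model y = w * x; O(len(data)) per call.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grid_search(grid, data):
    # Evaluates the loss on ALL N samples for each of the D grid values:
    # O(D * N) total work.
    return min(grid, key=lambda w: loss(w, data))

def batched_grid_search(grid, data, B):
    # Evaluates each grid value on a random batch of B samples instead:
    # O(D * B) total work, independent of N.
    batch = random.sample(data, B)
    return min(grid, key=lambda w: loss(w, batch))

data = [(x, 2.0 * x) for x in range(1000)]  # N = 1000 noiseless samples, true w = 2
grid = [i / 10 for i in range(41)]          # D = 41 candidate weights in [0, 4]
print(grid_search(grid, data))              # → 2.0
print(batched_grid_search(grid, data, B=32))  # → 2.0 (data is noiseless)
```

With noisy data the batched estimate would of course be approximate; the point is only that its per-step cost is O(B) rather than O(N), which is exactly the distinction that matters when we compare grid search with SGD.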