Computational cost

If the cost to compute \( \mathbf{e}\) is

$$ O(N*D) + O(N) $$

and the cost to compute \(\mathbf{X}^{\top} \mathbf{e}\) is also

$$ O(N*D) $$

then wouldn't the complexity of computing the gradient starting from \(\mathbf{w}\) (with \(\mathbf{e}\) not given) be

$$ O(N^{2} * D^{2}) \;? $$

Computing the gradient (of MSE in least squares regression), without \(\mathbf{e}\) given, is done in a sequential manner:

  • compute error \(\mathbf{e}\), with complexity \(O(N*D) + O(N)\)
  • compute gradient \(\sim \mathbf{X}^{\top} \mathbf{e}\), with complexity \(O(N*D)\)

so the complexities need to be summed (not multiplied):

$$O(N*D) + O(N) + O(N*D) = O(N*D) + O(N*D) = O(N*D)$$
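As a concrete sketch (assuming the usual setup \(\mathbf{e} = \mathbf{y} - \mathbf{X}\mathbf{w}\) and the MSE cost \(\frac{1}{2N}\|\mathbf{e}\|^{2}\); the NumPy names below are only illustrative placeholders):

```python
import numpy as np

def mse_gradient(X, y, w):
    """Gradient of (1 / (2N)) * ||y - X w||^2 with respect to w, computed sequentially."""
    N = X.shape[0]
    e = y - X @ w           # error vector: X @ w costs O(N*D), the subtraction costs O(N)
    grad = -(X.T @ e) / N   # X^T e costs O(N*D); scaling by 1/N costs O(D)
    return grad             # total: O(N*D) + O(N) + O(N*D) = O(N*D)
```

The two matrix-vector products run one after the other, which is why their costs add rather than multiply.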

Can you please explain why \(O(N*D) + O(N) + O(N*D) = O(N*D) + O(N*D) = O(N*D)\) simplifies to \(O(N*D)\)?

The missing step is \(O(3*N*D)\): since \(D \ge 1\), the \(O(N)\) term is absorbed into \(O(N*D)\), leaving at most three \(O(N*D)\) terms.
And then we can drop the constant (3), because big-O notation ignores constant factors.
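
Written out as a single chain (using \(O(N) \subseteq O(N*D)\) because \(D \ge 1\)):

$$ O(N*D) + O(N) + O(N*D) \;\subseteq\; O(N*D) + O(N*D) + O(N*D) = O(3*N*D) = O(N*D) $$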
