Computational cost for projected gradient descent

What is the Computational cost for projected gradient descent?

If you look at page 16 of the slides of week 2, you'll see that it depends on the geometry of the set you are projecting on. I was talking about this with some friends of mine yesterday and we settled that the complexity of the "normal" part of the GD would be the same as usual. However, depending on the boundary of your set, it might be quite difficult/expensive to projet onto and so the cost is highly dependent on this. At best, you just have to project your weight onto a line/hyperplane as in the drawing and here the cost is simply that of a dot product (remember projection is basically equivalent to a dot product). However, if you have lots of potential candidate hyperplanes or some non-planar surface (imagine an ellipsoid or some weird hyperbolic shape), then it might get much more complex.

Hope this helps!


Page 1 of 1

Add comment

Post as Anonymous Dont send out notification