On the Complexity of Finding Small Subgradients in Nonsmooth Optimization
Guy Kornowski, Ohad Shamir

TL;DR
This paper investigates the oracle complexity of finding approximate stationary points of nonsmooth Lipschitz functions. It shows that deterministic algorithms cannot achieve a dimension-free rate, that the randomized rate can be derandomized for smooth functions, and that several lower bounds hold for any randomized algorithm.
Contribution
It provides new lower bounds, analyzes derandomization for smooth functions, and compares complexities between deterministic and randomized approaches in nonsmooth optimization.
Findings
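Deterministic algorithms cannot achieve dimension-free rates.
Dimension-free randomized algorithms exist, and their rate can be derandomized for smooth functions with only a logarithmic dependence on the smoothness parameter.
Several lower bounds hold for any randomized algorithm, with or without convexity.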
Abstract
We study the oracle complexity of producing (δ, ε)-stationary points of Lipschitz functions, in the sense proposed by Zhang et al. [2020]. While there exist dimension-free randomized algorithms for producing such points within Õ(1/(δε³)) first-order oracle calls, we show that no dimension-free rate can be achieved by a deterministic algorithm. On the other hand, we point out that this rate can be derandomized for smooth functions with merely a logarithmic dependence on the smoothness parameter. Moreover, we establish several lower bounds for this task which hold for any randomized algorithm, with or without convexity. Finally, we show how the convergence rate of finding (δ, ε)-stationary points can be improved in case the function is convex, a setting which we motivate by proving that in general no finite time algorithm can produce…
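For reference, the stationarity notion of Zhang et al. [2020] is usually stated through the Goldstein δ-subdifferential: the convex hull of all subgradients of f at points within distance δ of x. A point x is (δ, ε)-stationary when that set contains an element of norm at most ε. The sketch below is an illustration only and is not taken from the paper; it assumes the one-dimensional function f(x) = |x|, where the set has a closed form, and the function names are hypothetical.

```python
def goldstein_min_norm_abs(x: float, delta: float) -> float:
    """Smallest |g| over the Goldstein delta-subdifferential of f(x) = |x|.

    Illustrative assumption: f is the absolute value on the real line,
    so the delta-subdifferential can be written down exactly.
    """
    if delta < 0:
        raise ValueError("delta must be non-negative")
    # The interval [x - delta, x + delta] contains the kink at 0 exactly when
    # |x| <= delta. The subdifferential of |.| at 0 is [-1, 1], which contains 0,
    # so the minimal-norm element of the convex hull is 0.
    if abs(x) <= delta:
        return 0.0
    # Otherwise f is differentiable on the whole interval with derivative
    # sign(x), so the convex hull is the single point {sign(x)}.
    return 1.0


def is_delta_eps_stationary_abs(x: float, delta: float, eps: float) -> bool:
    """Check (delta, eps)-stationarity of x for f(x) = |x|."""
    return goldstein_min_norm_abs(x, delta) <= eps
```

In this toy case, x = 0.05 is (0.1, ε)-stationary for every ε ≥ 0 even though the derivative at that point has magnitude 1, which is the sense in which the notion relaxes the requirement of a small subgradient at x itself.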
Taxonomy
Topics: Machine Learning and Algorithms · Complexity and Algorithms in Graphs · Stochastic Gradient Optimization Techniques
