# First-order algorithms converge faster than $O(1/k)$ on convex problems

**Authors:** Ching-pei Lee, Stephen J. Wright

arXiv: 1812.08485 · 2019-05-15

## TL;DR

This paper proves that gradient descent and stochastic coordinate descent, together with their proximal variants, converge at a rate of $o(1/k)$ in objective value on convex problems, strictly improving the classical $O(1/k)$ bound.

## Contribution

It establishes that several first-order methods attain an $o(1/k)$ convergence rate in objective value, improving on the classical $O(1/k)$ rate, and shows that the result is tight: a rate of $O(1/k^{1+\epsilon})$ is not generally attainable for any $\epsilon>0$.
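
For reference, the two rates can be contrasted as follows, writing $f^*$ for the optimal objective value and $x_k$ for the $k$-th iterate (this notation is assumed here for illustration and is not taken from the summary). The classical bound guarantees a constant $C > 0$ with

$$
f(x_k) - f^* \le \frac{C}{k} \quad \text{for all } k \ge 1,
$$

whereas an $o(1/k)$ rate means the gap vanishes strictly faster than $1/k$:

$$
\lim_{k \to \infty} k \,\bigl(f(x_k) - f^*\bigr) = 0.
$$

For the stochastic methods, the gap would presumably be read in expectation.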

## Key findings

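- Gradient descent and stochastic coordinate descent achieve an $o(1/k)$ convergence rate in objective value on unconstrained convex problems with Lipschitz-continuous gradients.
- Proximal gradient and proximal coordinate descent attain a similar $o(1/k)$ rate on regularized problems.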
- The $o(1/k)$ rate is tight and cannot be improved to $O(1/k^{1+\theta})$ for any $\theta>0$.
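
The first finding can be checked numerically on a toy problem. The sketch below is not from the paper: it assumes a hypothetical one-dimensional test function (quartic near the origin, linear in the tails), which is convex, has a gradient with Lipschitz constant $L = 3$, and is not strongly convex at its minimizer. It runs gradient descent with step size $1/L$ and records the scaled gap $k\,(f(x_k) - f^*)$, which should shrink toward zero if the rate is $o(1/k)$.

```python
def f(x):
    """Hypothetical test function: quartic on [-1, 1], linear outside."""
    return x**4 / 4 if abs(x) <= 1 else abs(x) - 0.75


def grad_f(x):
    """Gradient of f; continuous at |x| = 1, where both branches equal +-1."""
    if abs(x) <= 1:
        return x**3
    return 1.0 if x > 0 else -1.0


# Lipschitz constant of grad_f: f''(x) = 3x^2 <= 3 on [-1, 1], and 0 outside.
L = 3.0


def scaled_gaps(x0, checkpoints):
    """Run gradient descent with step 1/L; return k * (f(x_k) - f*) at checkpoints."""
    f_star = 0.0  # the minimum of f, attained at x = 0
    x = x0
    out = {}
    for k in range(1, max(checkpoints) + 1):
        x -= grad_f(x) / L
        if k in checkpoints:
            out[k] = k * (f(x) - f_star)
    return out


gaps = scaled_gaps(2.0, {10, 100, 1000, 10000})
for k in sorted(gaps):
    print(k, gaps[k])
```

A single instance like this one only illustrates the $o(1/k)$ behaviour. The tightness statement concerns what can be guaranteed over the whole problem class, so an individual problem converging faster does not contradict it.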

## Abstract

It is well known that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value, when applied to a scheme for minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate descent on regularized problems to show similar $o(1/k)$ convergence rates. The result is tight in the sense that a rate of $O(1/k^{1+\epsilon})$ is not generally attainable for any $\epsilon>0$, for any of these methods.
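
To make the proximal extension concrete, here is a minimal sketch of proximal gradient descent. Everything specific in it is an assumption chosen for illustration rather than taken from the paper: the smooth part is a tiny least-squares term, the regularizer is an $\ell_1$ penalty (whose proximal operator is soft-thresholding), and the data `A`, `B`, `LAM` are made up. The example shows the mechanics of the iteration and does not by itself demonstrate the $o(1/k)$ rate.

```python
def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


# Hypothetical problem data: minimize 0.5 * (A.x - B)^2 + LAM * ||x||_1 in 2D.
A = (1.0, 1.0)
B = 2.0
LAM = 0.5
L = sum(a * a for a in A)  # Lipschitz constant of the smooth part's gradient


def objective(x):
    """Regularized objective: smooth least-squares term plus l1 penalty."""
    residual = sum(a * xi for a, xi in zip(A, x)) - B
    return 0.5 * residual**2 + LAM * sum(abs(xi) for xi in x)


def prox_grad_step(x):
    """One iteration: gradient step on the smooth part, then soft-thresholding."""
    residual = sum(a * xi for a, xi in zip(A, x)) - B
    grad = [a * residual for a in A]
    return [soft_threshold(xi - g / L, LAM / L) for xi, g in zip(x, grad)]


x = [3.0, -1.0]
history = [objective(x)]
for _ in range(100):
    x = prox_grad_step(x)
    history.append(objective(x))

print(history[0], history[-1])
```

With step size $1/L$ the objective values in `history` are non-increasing, which is the descent property that rate analyses of this kind typically build on.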

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.08485/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/1812.08485/full.md

---
Source: https://tomesphere.com/paper/1812.08485