Parameter-free projected gradient descent

Evgenii Chzhen (LMO; CELESTE); Christophe Giraud (LMO; CELESTE); Gilles Stoltz (LMO; CELESTE)

arXiv:2305.19605·stat.ML·June 1, 2023·1 cites

Open Access

TL;DR

This paper introduces a fully parameter-free variant of AdaGrad for projected gradient descent. It attains optimal cumulative-regret rates up to logarithmic factors without any tuning parameters, handles projection steps, and extends to stochastic optimization.

Contribution

It presents a fully parameter-free version of AdaGrad for convex optimization with projections, which adapts to the distance between the initialization and the optimum and to the sum of the squared norms of the subgradients.

Findings

01

Achieves optimal rates of convergence for cumulative regret, up to logarithmic factors.

02

Handles projection steps without restarts, reweighing along the trajectory, or additional gradient evaluations compared to classical PGD.

03

Extends to stochastic optimization with supporting experiments.
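
For orientation, the quantity behind "optimal rates up to logarithmic factors" can be written out as follows. The notation is an assumption made for this note and is not taken from the page: $x_1$ is the initialization, $x^\star$ a minimizer over the constraint set, and $g_t$ a subgradient at the iterate $x_t$. The benchmark shown is the standard bound for projected subgradient descent with a single step size tuned in hindsight, which requires knowing $D$ and the subgradient norms in advance.

```latex
% Cumulative regret after T steps
R_T \;=\; \sum_{t=1}^{T} \bigl( f(x_t) - f(x^\star) \bigr)

% Standard tuned benchmark, with D = \|x_1 - x^\star\|
R_T \;\le\; D \,\sqrt{\sum_{t=1}^{T} \|g_t\|^2}
```

A parameter-free method aims to match this benchmark, up to logarithmic factors, without being given $D$ or the subgradient norms.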

Abstract

We consider the problem of minimizing a convex function over a closed convex set, with Projected Gradient Descent (PGD). We propose a fully parameter-free version of AdaGrad, which is adaptive to the distance between the initialization and the optimum, and to the sum of the square norm of the subgradients. Our algorithm is able to handle projection steps, does not involve restarts, reweighing along the trajectory or additional gradient evaluations compared to the classical PGD. It also fulfills optimal rates of convergence for cumulative regret up to logarithmic factors. We provide an extension of our approach to stochastic optimization and conduct numerical experiments supporting the developed theory.
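
As a point of reference, the sketch below shows classical projected subgradient descent with an AdaGrad-norm step size. It is not the paper's parameter-free algorithm: the `scale` argument is a hand-chosen stand-in for the unknown distance to the optimum, which is exactly the kind of tuning parameter the paper sets out to remove. The objective, the box constraint, and all function names are hypothetical choices made for illustration.

```python
import math


def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d (coordinate-wise clipping)."""
    return [min(max(xi, lo), hi) for xi in x]


def objective(x, target):
    """Illustrative convex objective f(x) = sum_i |x_i - target_i|."""
    return sum(abs(xi - ti) for xi, ti in zip(x, target))


def subgradient(x, target):
    """One subgradient of the objective: the coordinate-wise sign of x - target."""
    return [(xi > ti) - (xi < ti) for xi, ti in zip(x, target)]


def adagrad_norm_pgd(x0, target, lo, hi, scale, steps):
    """Projected subgradient descent with step scale / sqrt(sum of squared norms).

    `scale` must be chosen by hand here (assumption of this sketch);
    the paper's method is designed to avoid such a parameter.
    """
    x = list(x0)
    sq_sum = 0.0
    # For this separable objective, clipping the target gives the constrained minimizer.
    f_star = objective(project_box(target, lo, hi), target)
    regret = 0.0
    for _ in range(steps):
        g = subgradient(x, target)
        regret += objective(x, target) - f_star
        sq_sum += sum(gi * gi for gi in g)
        if sq_sum > 0.0:
            eta = scale / math.sqrt(sq_sum)
            x = project_box([xi - eta * gi for xi, gi in zip(x, g)], lo, hi)
    return x, regret


# Usage: minimize |x_0 - 3| + |x_1 + 2| over the box [-1, 1]^2, starting from the origin.
x_final, total_regret = adagrad_norm_pgd([0.0, 0.0], [3.0, -2.0], -1.0, 1.0, 1.0, 50)
print(x_final, total_regret)
```

The unconstrained minimizer lies outside the box, so the projection is active at the solution. The per-step cost is one subgradient evaluation and one projection, the same budget the abstract attributes to classical PGD.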

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods

Methods: AdaGrad