The Power of Factorial Powers: New Parameter settings for (Stochastic) Optimization

Aaron Defazio; Robert M. Gower

arXiv:2006.01244 · cs.LG · April 13, 2023

Open Access

TL;DR

This paper introduces factorial powers as a versatile tool for setting constants in convergence proofs, improving and simplifying convergence rate analyses for various optimization algorithms.

Contribution

It proposes using factorial powers to define constants in convergence proofs, enhancing the analysis of momentum, accelerated gradient, and SVRG methods.

Findings

01 Factorial powers have useful mathematical properties.

02 Applying factorial powers simplifies convergence proofs.

03 Factorial powers improve convergence rate bounds for several optimization algorithms.

Abstract

The convergence rates for convex and non-convex optimization methods depend on the choice of a host of constants, including step sizes, Lyapunov function constants, and momentum constants. In this work we propose the use of factorial powers as a flexible tool for defining constants that appear in convergence proofs. We list a number of remarkable properties that these sequences enjoy, and show how they can be applied to convergence proofs to simplify or improve the convergence rates of the momentum method, accelerated gradient, and the stochastic variance reduced gradient method (SVRG).
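The abstract does not define factorial powers, so the following is only a minimal sketch under an assumption: it uses the standard rising factorial power, k(k+1)...(k+r-1), for integer k and non-negative integer r. The function names `rising_factorial` and `sum_of_rising_factorials` are hypothetical and chosen for illustration. The sketch shows one well-known property of such sequences that makes them convenient as constants in a proof: their sums telescope into a closed form, (b^(r+1) - (a-1)^(r+1)) / (r+1), where the exponents denote rising factorial powers.

```python
from fractions import Fraction


def rising_factorial(k: int, r: int) -> int:
    """Rising factorial power k(k+1)...(k+r-1); the empty product (r = 0) is 1.

    Assumption: this is the standard textbook definition, not a definition
    quoted from the abstract above.
    """
    result = 1
    for j in range(r):
        result *= k + j
    return result


def sum_of_rising_factorials(a: int, b: int, r: int) -> Fraction:
    """Closed form for the sum of rising_factorial(i, r) over i = a, ..., b.

    Follows from the telescoping identity
    (k+1)^(r+1) - k^(r+1) = (r+1) * (k+1)^(r), with rising factorial powers.
    """
    numerator = rising_factorial(b, r + 1) - rising_factorial(a - 1, r + 1)
    return Fraction(numerator, r + 1)


if __name__ == "__main__":
    # Compare the direct sum against the closed form for a small case.
    direct = sum(rising_factorial(i, 2) for i in range(1, 6))
    closed = sum_of_rising_factorials(1, 5, 2)
    print(direct, closed)
```

The closed form mirrors how integrating a monomial raises its exponent by one, which is one reason weights of this shape are easy to sum when they appear inside a Lyapunov-style argument.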

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods