An Exponentially Increasing Step-size for Parameter Estimation in Statistical Models

Nhat Ho, Tongzheng Ren, Sujay Sanghavi, Purnamrita Sarkar, and Rachel Ward

arXiv:2205.07999·stat.ML·February 3, 2023

TL;DR

This paper introduces an exponential step-size gradient descent (EGD) method that accelerates parameter estimation in statistical models, especially non-regular ones, where it reaches the statistical radius within a logarithmic number of iterations.

Contribution

The paper proposes an exponential step-size gradient descent (EGD) algorithm that converges faster than GD with a fixed or decaying step-size in non-regular statistical models, bridging the gap between statistical and computational complexities.

Findings

1. EGD converges linearly to the optimal solution under homogeneous assumptions on the loss function.

2. EGD reaches the statistical radius within a logarithmic number of iterations in non-regular models.

3. EGD is exponentially more efficient than standard GD in non-regular statistical settings.

Abstract

Using gradient descent (GD) with fixed or decaying step-size is a standard practice in unconstrained optimization problems. However, when the loss function is only locally convex, such a step-size schedule artificially slows GD down as it cannot explore the flat curvature of the loss function. To overcome that issue, we propose to exponentially increase the step-size of the GD algorithm. Under homogeneous assumptions on the loss function, we demonstrate that the iterates of the proposed exponential step size gradient descent (EGD) algorithm converge linearly to the optimal solution. Leveraging that optimization insight, we then consider using the EGD algorithm for solving parameter estimation under both regular and non-regular statistical models whose loss function becomes locally convex when the sample size goes to infinity. We demonstrate that the EGD iterates reach the final…
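The contrast between a fixed step-size and an exponentially increasing one can be illustrated with a minimal sketch. It is not the authors' implementation: the toy loss f(theta) = theta^4 / 4 (flat curvature at its minimizer 0), the function names `run_gd` and `run_egd`, and the values of `eta` and `growth` are all assumptions chosen for illustration.

```python
# Toy comparison of fixed step-size GD and exponentially increasing
# step-size GD (EGD). The loss f(theta) = theta**4 / 4 is an assumed
# example of a loss that is flat around its minimizer theta = 0.

def grad(theta):
    """Gradient of the assumed toy loss f(theta) = theta**4 / 4."""
    return theta ** 3

def run_gd(theta0, eta, steps):
    """Plain gradient descent with a fixed step-size eta."""
    theta = theta0
    for _ in range(steps):
        theta -= eta * grad(theta)
    return theta

def run_egd(theta0, eta, growth, steps):
    """Gradient descent whose step-size at iteration t is eta * growth**t.

    `growth` > 1 is a hypothetical name for the exponential growth factor.
    """
    theta = theta0
    for t in range(steps):
        theta -= eta * (growth ** t) * grad(theta)
    return theta

if __name__ == "__main__":
    steps = 50
    gd_final = run_gd(theta0=1.0, eta=0.1, steps=steps)
    egd_final = run_egd(theta0=1.0, eta=0.1, growth=2.0, steps=steps)
    print(f"GD  after {steps} steps: |theta| = {abs(gd_final):.3e}")
    print(f"EGD after {steps} steps: |theta| = {abs(egd_final):.3e}")
```

On this toy problem the fixed step-size iterate shrinks only polynomially in the iteration count, while the increasing step-size iterate shrinks geometrically. The growth factor was picked by hand; in this toy setting a much larger value can destabilize the iterates.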

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Markov Chains and Monte Carlo Methods