Basic Inequalities for First-Order Optimization with Applications to Statistical Risk Analysis

Seunghoon Paik; Kangjie Zhou; Matus Telgarsky; Ryan J. Tibshirani

arXiv:2512.24999 · math.ST · January 1, 2026

Open Access

TL;DR

This paper develops basic inequalities for first-order optimization algorithms: bounds on the optimization gap that translate the number of iterations into an effective regularization coefficient. It uses them to analyze training dynamics and prediction risk.

Contribution

It isolates a specific form of inequality that connects implicit regularization (from the iterations) with explicit regularization (in the loss), and develops it as a tool for statistical analysis of several algorithms.

Findings

01

Refined analysis of gradient descent dynamics.

02

New results for mirror descent and exponentiated gradient.

03

Experimental validation on generalized linear models.

Abstract

We introduce basic inequalities for first-order iterative optimization algorithms, forming a simple and versatile framework that connects implicit and explicit regularization. While related inequalities appear in the literature, we isolate and highlight a specific form and develop it as a well-rounded tool for statistical analysis. Let $f$ denote the objective function to be optimized. Given a first-order iterative algorithm initialized at $\theta_0$ with current iterate $\theta_T$, the basic inequality upper bounds $f(\theta_T) - f(z)$ for any reference point $z$ in terms of the accumulated step sizes and the distances between $\theta_0$, $\theta_T$, and $z$. The bound translates the number of iterations into an effective regularization coefficient in the loss function. We demonstrate this framework through analyses of training dynamics and prediction risk bounds. In addition to…
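To make the shape of such a bound concrete, here is a minimal numerical sketch. It does not reproduce the paper's own statement. It assumes the standard textbook inequality for gradient descent on a convex, $L$-smooth objective with constant step size $\eta \le 1/L$, namely $f(\theta_T) - f(z) \le (\|\theta_0 - z\|^2 - \|\theta_T - z\|^2) / (2 \eta T)$, which may differ from the paper's exact form. The least-squares objective, the problem sizes, and the helper names (`loss`, `gradient_descent`, `gap_and_bound`) are all illustrative assumptions. Rearranged, this inequality reads like a comparison of ridge-type penalized losses with coefficient $1/(2 \eta T)$, which is one way to see iterations acting as regularization.

```python
import numpy as np

def loss(X, y, theta):
    # Least-squares objective f(theta) = (1 / 2n) * ||X theta - y||^2 (assumed example).
    r = X @ theta - y
    return 0.5 * np.mean(r ** 2)

def gradient_descent(X, y, eta, T, theta0):
    # Plain gradient descent with a constant step size eta for T iterations.
    n = X.shape[0]
    theta = theta0.copy()
    for _ in range(T):
        grad = X.T @ (X @ theta - y) / n
        theta = theta - eta * grad
    return theta

rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Smoothness constant of f is the largest eigenvalue of X^T X / n.
L = np.linalg.eigvalsh(X.T @ X / n).max()
eta = 0.9 / L          # step size safely below 1 / L
T = 20
theta0 = np.zeros(d)
thetaT = gradient_descent(X, y, eta, T, theta0)

def gap_and_bound(z):
    # Left side: f(theta_T) - f(z).
    # Right side: (||theta_0 - z||^2 - ||theta_T - z||^2) / (2 * eta * T).
    gap = loss(X, y, thetaT) - loss(X, y, z)
    bound = (np.sum((theta0 - z) ** 2) - np.sum((thetaT - z) ** 2)) / (2 * eta * T)
    return gap, bound

# Check the inequality at many random reference points z.
checks = [gap_and_bound(rng.normal(size=d)) for _ in range(100)]
all_hold = all(gap <= bound + 1e-10 for gap, bound in checks)
print("inequality holds at all sampled z:", all_hold)
```

The reference point `z` is arbitrary here, which matches the abstract's "for any reference point $z$"; taking `z` to be a minimizer recovers the usual convergence-rate reading of the same inequality.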

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques