Learning complexity of gradient descent and conjugate gradient algorithms

Xianqi Jiao; Jia Liu; Zhiping Chen

arXiv:2412.13473·math.OC·December 19, 2024

TL;DR

This paper models the selection of gradient descent and conjugate gradient algorithms as a statistical learning problem, bounds the learning complexity of these algorithm classes through their pseudo-dimension, and shows that a well-performing algorithm can be learned from data.

Contribution

The paper introduces a new cost measure for unconstrained optimization algorithms, derives upper bounds on the pseudo-dimension of the resulting algorithm classes, and extends the analysis from GD to CG algorithms, so that an optimal algorithm can be identified with high probability given enough data.

Findings

01. Derived an upper bound for the pseudo-dimension of GD algorithms.

02. Extended the analysis to conjugate gradient algorithms for the first time.

03. Proved the existence of a learning algorithm to identify optimal algorithms with sufficient data.
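
For context, the link between a pseudo-dimension bound and "sufficient data" is usually made through a standard uniform-convergence result from statistical learning theory. The statement below is that generic textbook form, not the paper's own theorem; the symbols $H$ (an assumed upper bound on the cost), $d$ (the pseudo-dimension), $\varepsilon$, $\delta$, the constant $c$, and the distribution $\mathcal{D}$ are introduced here for illustration only.

```latex
% Generic sample-complexity bound (assumed notation, not taken from the paper).
% Let \mathcal{A} be a class of algorithms whose costs lie in [0, H] and whose
% induced cost functions have pseudo-dimension d. If
\[
  m \;\ge\; c \left(\frac{H}{\varepsilon}\right)^{2}
            \left( d + \ln\frac{1}{\delta} \right)
\]
% instances x_1, \dots, x_m are drawn i.i.d. from \mathcal{D}, then with
% probability at least 1 - \delta, for every A \in \mathcal{A},
\[
  \left| \frac{1}{m}\sum_{i=1}^{m} \mathrm{cost}(A, x_i)
         \;-\; \mathbb{E}_{x \sim \mathcal{D}}\bigl[\mathrm{cost}(A, x)\bigr]
  \right| \;\le\; \varepsilon .
\]
```

Under this kind of guarantee, the algorithm with the lowest average cost on the sample is within $2\varepsilon$ of the best expected cost in the class, which is why a smaller pseudo-dimension bound translates into fewer required instances.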

Abstract

Gradient Descent (GD) and Conjugate Gradient (CG) methods are among the most effective iterative algorithms for solving unconstrained optimization problems, particularly in machine learning and statistical modeling, where they are employed to minimize cost functions. In these algorithms, tunable parameters, such as step sizes or conjugate parameters, play a crucial role in determining key performance metrics, like runtime and solution quality. In this work, we introduce a framework that models algorithm selection as a statistical learning problem, and thus learning complexity can be estimated by the pseudo-dimension of the algorithm group. We first propose a new cost measure for unconstrained optimization algorithms, inspired by the concept of primal-dual integral in mixed-integer linear programming. Based on the new cost measure, we derive an improved upper bound for the…
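
The idea of treating algorithm selection as a learning problem can be illustrated with a minimal Python sketch. It is not the paper's method: the one-dimensional quadratic instances, the candidate step sizes, the function names `gd_cost` and `select_step_size`, and the capped cumulative-gap cost are all assumptions chosen for illustration. The cost only loosely echoes the integral-style measure mentioned in the abstract and should not be read as its definition.

```python
import random


def gd_cost(a, b, step, x0=0.0, tol=1e-6, max_iters=200, cap=1e3):
    """Run fixed-step GD on f(x) = 0.5*a*x**2 - b*x (with a > 0).

    Returns the sum of suboptimality gaps f(x_k) - f* over the iterates,
    capped at `cap` so the cost stays bounded even when GD diverges.
    This cost is a hypothetical stand-in, not the paper's measure.
    """
    def f(z):
        return 0.5 * a * z * z - b * z

    f_star = f(b / a)  # the minimiser of the quadratic is x* = b / a
    x, cost = x0, 0.0
    for _ in range(max_iters):
        gap = f(x) - f_star
        if gap <= tol:
            break
        cost += gap
        if cost >= cap:
            return cap
        x -= step * (a * x - b)  # gradient of f is a*x - b
    return min(cost, cap)


def select_step_size(instances, candidates):
    """Empirical risk minimisation: pick the candidate step size with the
    lowest average cost over the sampled (a, b) instances."""
    def average_cost(step):
        return sum(gd_cost(a, b, step) for a, b in instances) / len(instances)

    return min(candidates, key=average_cost)


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed instance distribution: curvature a in [1, 4], slope b in [-1, 1].
    sample = [(rng.uniform(1.0, 4.0), rng.uniform(-1.0, 1.0)) for _ in range(50)]
    steps = [0.05, 0.1, 0.25, 0.4, 0.6]
    print("selected step size:", select_step_size(sample, steps))
```

Here the "algorithm class" is the family of GD variants indexed by step size, and learning amounts to choosing the member with the best average cost on sampled instances. Capping the cost keeps it in a bounded range, which is the setting in which pseudo-dimension bounds are typically applied.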

Taxonomy

Topics: Face and Expression Recognition · Neural Networks and Applications