Regularized Risk Minimization by Nesterov's Accelerated Gradient Methods: Algorithmic Extensions and Empirical Studies

Xinhua Zhang; Ankan Saha; S.V.N. Vishwanathan

arXiv:1011.0472·cs.LG·November 3, 2010·5 cites

TL;DR

This paper extends Nesterov's accelerated gradient methods to handle strongly convex and composite functions, providing a unified framework that improves convergence and empirical performance on max-margin models.

Contribution

The authors develop a unifying AGM framework with adaptive Lipschitz tuning and duality gap bounds, enhancing its applicability and efficiency for machine learning tasks.

Findings

01

AGM outperforms state-of-the-art solvers on max-margin models

02

Framework covers both $\infty$-memory and 1-memory AGM styles

03

Enhanced convergence rates and efficient gradient computations

Abstract

Nesterov's accelerated gradient methods (AGM) have been successfully applied in many machine learning areas. However, their empirical performance on training max-margin models has been inferior to existing specialized solvers. In this paper, we first extend AGM to strongly convex and composite objective functions with Bregman-style prox-functions. Our unifying framework covers both the $\infty$-memory and 1-memory styles of AGM, tunes the Lipschitz constant adaptively, and bounds the duality gap. Then we demonstrate various ways to apply this framework of methods to a wide range of machine learning problems. Emphasis is placed on their rate of convergence and on how to efficiently compute the gradient and optimize the models. The experimental results show that with our extensions AGM outperforms state-of-the-art solvers on max-margin models.
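To make the ingredients named in the abstract concrete (a composite objective, a 1-memory accelerated update, and a Lipschitz estimate tuned on the fly), here is a minimal sketch. It is not the paper's algorithm: it is the well-known FISTA-style accelerated proximal gradient method with backtracking, using a Euclidean prox rather than a general Bregman one, and its backtracking only ever increases the Lipschitz estimate. The function names, the toy objective, and all parameter values are assumptions chosen for illustration.

```python
import math

def soft_threshold(v, t):
    """Elementwise proximal operator of t * ||.||_1."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]

def accelerated_prox_gradient(f, grad_f, lam, x0, L0=1.0, iters=2000):
    """FISTA-style sketch for min_x f(x) + lam * ||x||_1 with backtracking on L.

    f, grad_f: smooth convex part and its gradient (hypothetical callables).
    L0: initial guess of the gradient's Lipschitz constant; doubled when too small.
    """
    x, y = list(x0), list(x0)
    t, L = 1.0, L0
    for _ in range(iters):
        g, fy = grad_f(y), f(y)
        while True:
            # Proximal gradient step from the extrapolated point y.
            x_new = soft_threshold([yi - gi / L for yi, gi in zip(y, g)], lam / L)
            d = [xn - yi for xn, yi in zip(x_new, y)]
            # Quadratic upper bound on f around y; valid once L is large enough.
            bound = (fy + sum(gi * di for gi, di in zip(g, d))
                     + 0.5 * L * sum(di * di for di in d))
            if f(x_new) <= bound + 1e-12:
                break
            L *= 2.0  # estimate was too optimistic: increase and retry
        # Momentum (extrapolation) uses only the previous iterate: "1-memory".
        t_new = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        y = [xn + (t - 1.0) / t_new * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x, L

# Toy problem (an assumption): f(x) = 0.5*(x0 - 3)^2 + 2*(x1 + 0.1)^2, lam = 1.
def f(x):
    return 0.5 * (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 0.1) ** 2

def grad_f(x):
    return [x[0] - 3.0, 4.0 * (x[1] + 0.1)]

x_hat, L_hat = accelerated_prox_gradient(f, grad_f, lam=1.0, x0=[-5.0, 4.0])
```

The toy problem is separable, so its minimizer can be checked by hand: the first coordinate shrinks from 3 to 2, and the second is thresholded to exactly 0, so `x_hat` should approach (2, 0). Starting from a deliberately small `L0` shows the backtracking loop doing the work that a known Lipschitz constant would otherwise do.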

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and ELM