Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization

Shai Shalev-Shwartz; Tong Zhang

arXiv:1209.1873 · stat.ML · March 20, 2015 · 254 cites

TL;DR

This paper provides a new theoretical analysis of Stochastic Dual Coordinate Ascent (SDCA), demonstrating its strong convergence guarantees for large-scale regularized loss minimization, comparable or superior to Stochastic Gradient Descent (SGD).

Contribution

The paper introduces a novel convergence analysis for SDCA, establishing its theoretical advantages and practical effectiveness in supervised machine learning tasks.

Findings

01

SDCA has strong convergence guarantees that are similar to or better than those of SGD.

02

The analysis justifies SDCA's effectiveness in large-scale applications.

03

SDCA is theoretically sound for regularized loss minimization.

Abstract

Stochastic Gradient Descent (SGD) has become popular for solving large-scale supervised machine learning optimization problems such as SVM, due to its strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has been implemented in various software packages, it has so far lacked good convergence analysis. This paper presents a new analysis of Stochastic Dual Coordinate Ascent (SDCA) showing that this class of methods enjoys strong theoretical guarantees that are comparable to or better than those of SGD. This analysis justifies the effectiveness of SDCA for practical applications.
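
To make the method named in the abstract concrete, here is a minimal sketch of SDCA for an L2-regularized linear SVM (hinge loss), one of the problem classes the abstract mentions. It is an illustration under stated assumptions, not the authors' reference implementation: the function names (`sdca_hinge`, `primal_objective`, `dual_objective`), the toy data, and the values of `lam` and `num_iters` are all hypothetical, and the coordinate step is the standard closed-form maximizer for the hinge loss. Each iteration picks one example uniformly at random, maximizes the dual objective over that example's dual variable, and updates the primal vector to keep w = (1 / (lam * n)) * sum_i alpha_i * x_i. The duality gap (primal minus dual objective) gives a computable measure of progress.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sdca_hinge(X, y, lam, num_iters, seed=0):
    """Sketch of SDCA for hinge loss; labels y[i] must be +1 or -1."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n   # one dual variable per training example
    w = [0.0] * d       # primal vector, kept equal to sum_i alpha_i x_i / (lam n)
    for _ in range(num_iters):
        i = rng.randrange(n)            # sample a coordinate uniformly at random
        sq_norm = dot(X[i], X[i])
        if sq_norm == 0.0:
            continue                    # an all-zero example cannot change w
        margin = y[i] * dot(w, X[i])
        # Closed-form maximizer of the dual in coordinate i: alpha_i * y_i is
        # clipped to [0, 1], the feasible range for the hinge-loss dual.
        target = (1.0 - margin) * lam * n / sq_norm + alpha[i] * y[i]
        delta = y[i] * min(1.0, max(0.0, target)) - alpha[i]
        alpha[i] += delta
        scale = delta / (lam * n)
        w = [wj + scale * xj for wj, xj in zip(w, X[i])]
    return w, alpha

def primal_objective(w, X, y, lam):
    # Average hinge loss plus (lam / 2) * ||w||^2.
    hinge = sum(max(0.0, 1.0 - yi * dot(w, xi)) for xi, yi in zip(X, y))
    return hinge / len(X) + 0.5 * lam * dot(w, w)

def dual_objective(alpha, X, y, lam):
    # For the hinge loss with alpha_i * y_i in [0, 1], the conjugate term
    # reduces to alpha_i * y_i; w is recomputed from alpha.
    n, d = len(X), len(X[0])
    w = [sum(alpha[i] * X[i][j] for i in range(n)) / (lam * n) for j in range(d)]
    return sum(a * yi for a, yi in zip(alpha, y)) / n - 0.5 * lam * dot(w, w)

# Hypothetical toy data: four linearly separable points in the plane.
X = [[1.0, 1.0], [2.0, 1.5], [-1.0, -1.0], [-2.0, -1.5]]
y = [1.0, 1.0, -1.0, -1.0]
lam = 0.1

w, alpha = sdca_hinge(X, y, lam, num_iters=2000)
gap = primal_objective(w, X, y, lam) - dual_objective(alpha, X, y, lam)
print("w =", w, "duality gap =", gap)
```

Unlike SGD, this update has no learning rate to tune, since each coordinate step is solved exactly. The duality gap also provides a stopping criterion that plain SGD lacks.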

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Sparse and Compressive Sensing Techniques