Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization
Shai Shalev-Shwartz, Tong Zhang

TL;DR
This paper provides a new theoretical analysis of Stochastic Dual Coordinate Ascent (SDCA), showing that its convergence guarantees for large-scale regularized loss minimization are comparable or superior to those of Stochastic Gradient Descent (SGD).
Contribution
The paper introduces a novel convergence analysis for SDCA, establishing its theoretical advantages and practical effectiveness in supervised machine learning tasks.
Findings
SDCA has strong convergence guarantees that are comparable to or better than those of SGD.
The analysis justifies SDCA's effectiveness in large-scale applications.
SDCA is theoretically sound for regularized loss minimization.
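For context, the regularized loss minimization problem and its dual are commonly written as follows. The notation (convex losses \phi_i with conjugates \phi_i^*, examples x_i, regularization parameter \lambda, dual variables \alpha_i) is an assumption made here for illustration and is not spelled out in this summary.

```latex
% Primal objective over the weight vector w
P(w) = \frac{1}{n} \sum_{i=1}^{n} \phi_i(w^\top x_i) + \frac{\lambda}{2} \|w\|^2

% Dual objective over alpha, with one dual variable per training example
D(\alpha) = \frac{1}{n} \sum_{i=1}^{n} -\phi_i^*(-\alpha_i)
            - \frac{\lambda}{2} \Big\| \frac{1}{\lambda n} \sum_{i=1}^{n} \alpha_i x_i \Big\|^2

% Primal iterate induced by alpha, and the duality gap used to measure progress
w(\alpha) = \frac{1}{\lambda n} \sum_{i=1}^{n} \alpha_i x_i,
\qquad P(w(\alpha)) - D(\alpha) \ge 0
```

Dual coordinate ascent increases D(\alpha) by changing one \alpha_i at a time. The stochastic variant picks that coordinate at random.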
Abstract
Stochastic Gradient Descent (SGD) has become popular for solving large-scale supervised machine learning optimization problems such as SVM, due to its strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has been implemented in various software packages, it has so far lacked good convergence analysis. This paper presents a new analysis of Stochastic Dual Coordinate Ascent (SDCA), showing that this class of methods enjoys strong theoretical guarantees that are comparable to or better than those of SGD. This analysis justifies the effectiveness of SDCA for practical applications.
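To make the method concrete, here is a minimal sketch of SDCA for one special case: ridge regression with the squared loss (x_i·w - y_i)^2 and an L2 regularizer (lam / 2)·||w||^2. The summary above does not give any update rule. The closed-form step below comes from maximizing the dual objective over a single coordinate for this particular loss. The function names, the uniform sampling scheme, and the default epoch count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sdca_ridge(X, y, lam, epochs=50, seed=0):
    """SDCA sketch for P(w) = mean((X @ w - y)**2) + (lam / 2) * ||w||^2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)  # one dual variable per training example
    w = np.zeros(d)      # kept equal to X.T @ alpha / (lam * n)
    sq_norms = np.einsum("ij,ij->i", X, X)  # ||x_i||^2 for each row
    for _ in range(epochs * n):
        i = rng.integers(n)  # pick a dual coordinate uniformly at random
        # Exact maximizer of the dual over coordinate i (squared loss only).
        delta = (y[i] - X[i] @ w - 0.5 * alpha[i]) / (0.5 + sq_norms[i] / (lam * n))
        alpha[i] += delta
        w += delta * X[i] / (lam * n)  # keep w consistent with alpha
    return w, alpha


def primal_objective(X, y, lam, w):
    return np.mean((X @ w - y) ** 2) + 0.5 * lam * (w @ w)


def dual_objective(X, y, lam, alpha):
    n = X.shape[0]
    w = X.T @ alpha / (lam * n)
    # For the squared loss, -phi_i^*(-a) = a * y_i - a**2 / 4.
    return np.mean(alpha * y - alpha ** 2 / 4) - 0.5 * lam * (w @ w)


if __name__ == "__main__":
    # Synthetic data, purely for illustration.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
    w, alpha = sdca_ridge(X, y, lam=0.1)
    gap = primal_objective(X, y, 0.1, w) - dual_objective(X, y, 0.1, alpha)
    print("duality gap:", gap)
```

The duality gap (primal minus dual objective) is always nonnegative and shrinks toward zero as the iterates converge, so it serves as a stopping criterion that needs no knowledge of the optimum. Unlike SGD, the update has no step size to tune in this setting.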
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Sparse and Compressive Sensing Techniques
