Stochastic Dual Coordinate Ascent with Adaptive Probabilities
Dominik Csiba, Zheng Qu, Peter Richtárik

TL;DR
This paper presents AdaSDCA, an adaptive stochastic dual coordinate ascent method that dynamically adjusts probabilities for improved convergence, along with a practical variant AdaSDCA+ that outperforms existing methods.
Contribution
Introduction of AdaSDCA with adaptive probability updates and AdaSDCA+ for practical, more efficient empirical risk minimization.
Findings
AdaSDCA has a provably better complexity bound than SDCA with the best fixed probability distribution (importance sampling).
AdaSDCA+ outperforms existing non-adaptive methods in experiments.
Adaptive probability adjustment improves convergence speed.
Abstract
This paper introduces AdaSDCA: an adaptive variant of stochastic dual coordinate ascent (SDCA) for solving regularized empirical risk minimization problems. Our modification consists in allowing the method to adaptively change the probability distribution over the dual variables throughout the iterative process. AdaSDCA achieves a provably better complexity bound than SDCA with the best fixed probability distribution, known as importance sampling. However, it is of a theoretical character, as it is expensive to implement. We also propose AdaSDCA+: a practical variant which in our experiments outperforms existing non-adaptive methods.
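To make the idea of adaptive probabilities concrete, here is a minimal sketch of dual coordinate ascent for ridge regression in which the sampling distribution is recomputed at every step from the current dual residuals. It illustrates the general technique rather than the paper's exact algorithm. The function names, the choice of squared loss, and the specific weighting (residual magnitude times sqrt(||x_i||^2 + lambda * n)) are all assumptions made for this example.

```python
import numpy as np


def adaptive_sdca_ridge(X, y, lam, n_iters, seed=0):
    """Dual coordinate ascent for ridge regression with residual-based sampling.

    Primal: (1/n) * sum_i 0.5 * (x_i @ w - y_i)**2 + 0.5 * lam * ||w||^2.
    Hypothetical helper written for illustration only.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)                      # dual variables
    w = np.zeros(d)                          # primal iterate, w = X.T @ alpha / (lam * n)
    sq_norms = np.einsum("ij,ij->i", X, X)   # ||x_i||^2 for every example

    for _ in range(n_iters):
        # Dual residual: zero exactly when alpha_i is optimal for the current w.
        kappa = alpha - (y - X @ w)
        # Assumed weighting: favour coordinates with large residuals.
        weights = np.abs(kappa) * np.sqrt(sq_norms + lam * n)
        total = weights.sum()
        if total <= 1e-15:                   # all residuals vanish: already optimal
            break
        i = rng.choice(n, p=weights / total)
        # Closed-form maximiser of the dual along coordinate i (squared loss).
        delta = -kappa[i] / (1.0 + sq_norms[i] / (lam * n))
        alpha[i] += delta
        w += delta * X[i] / (lam * n)
    return w, alpha


def duality_gap(X, y, lam, w, alpha):
    """Primal objective minus dual objective; nonnegative by weak duality."""
    n = X.shape[0]
    primal = 0.5 * np.mean((X @ w - y) ** 2) + 0.5 * lam * (w @ w)
    v = X.T @ alpha / (lam * n)
    dual = np.mean(alpha * y - 0.5 * alpha ** 2) - 0.5 * lam * (v @ v)
    return primal - dual
```

Note that this sketch recomputes every residual at each iteration, which costs a full pass over the data per coordinate update. That overhead is one way to see why a fully adaptive scheme can be expensive to implement, and why a cheaper practical variant is attractive.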
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Sparse and Compressive Sensing Techniques
