Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields
Mark Schmidt, Reza Babanezhad, Mohamed Osama Ahmed, Aaron Defazio, Ann Clifton, Anoop Sarkar

TL;DR
This paper adapts the stochastic average gradient (SAG) method to training conditional random fields (CRFs), using structure in the CRF gradient to reduce memory requirements and a non-uniform sampling scheme to improve practical performance.
Contribution
It presents a practical, memory-efficient SAG implementation for CRFs, a non-uniform sampling scheme, a convergence-rate analysis of the SAGA variant under non-uniform sampling, and an empirical evaluation.
Findings
Often significantly outperforms existing methods in terms of the training objective
Performs as well as or better than optimally-tuned stochastic gradient methods in terms of test error
Non-uniform sampling substantially improves practical performance
Abstract
We apply stochastic average gradient (SAG) algorithms for training conditional random fields (CRFs). We describe a practical implementation that uses structure in the CRF gradient to reduce the memory requirement of this linearly-convergent stochastic gradient method, propose a non-uniform sampling scheme that substantially improves practical performance, and analyze the rate of convergence of the SAGA variant under non-uniform sampling. Our experimental results reveal that our method often significantly outperforms existing methods in terms of the training objective, and performs as well or better than optimally-tuned stochastic gradient methods in terms of test error.
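As a rough illustration of the idea in the abstract, the following is a minimal sketch of a SAGA-style update with non-uniform sampling. It is not the authors' CRF implementation: it uses a toy least-squares objective, stores the full gradient table rather than exploiting any structure to save memory, samples example j with probability proportional to an assumed per-example Lipschitz constant L_j = ||a_j||^2, and uses an assumed step size of 1/(4 * mean(L)). The function names `saga_nus` and `objective` are hypothetical.

```python
import random


def objective(A, b, x):
    """Average least-squares loss: (1/n) * sum_i 0.5 * (a_i . x - b_i)^2."""
    total = 0.0
    for a_i, b_i in zip(A, b):
        r = sum(a * xk for a, xk in zip(a_i, x)) - b_i
        total += 0.5 * r * r
    return total / len(A)


def saga_nus(A, b, n_iters=5000, seed=0):
    """Toy SAGA with importance (non-uniform) sampling on least squares."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])

    # Assumption: sample proportionally to L_j = ||a_j||^2, the Lipschitz
    # constant of the gradient of the j-th least-squares term.
    L = [sum(a * a for a in row) for row in A]
    L_mean = sum(L) / n
    p = [L_j / (n * L_mean) for L_j in L]
    step = 1.0 / (4.0 * L_mean)  # assumed heuristic step size

    x = [0.0] * d
    memory = [[0.0] * d for _ in range(n)]  # last stored gradient per example
    memory_mean = [0.0] * d                 # running average of the table
    indices = range(n)

    for _ in range(n_iters):
        j = rng.choices(indices, weights=p)[0]
        r = sum(a * xk for a, xk in zip(A[j], x)) - b[j]
        grad_j = [a * r for a in A[j]]

        # Weighting by 1 / (n * p_j) keeps the gradient estimate unbiased.
        w = 1.0 / (n * p[j])
        for k in range(d):
            diff = grad_j[k] - memory[j][k]
            x[k] -= step * (w * diff + memory_mean[k])  # uses the old average
            memory_mean[k] += diff / n                  # then refresh it
        memory[j] = grad_j

    return x
```

On a small synthetic problem, calling `saga_nus(A, b)` and comparing `objective(A, b, x)` before and after gives a quick sanity check. The n-by-d gradient table here is exactly the memory cost that the paper's structure-aware implementation is meant to reduce.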
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Gaussian Processes and Bayesian Inference
Methods: SAGA · Conditional Random Field
