TL;DR
This paper introduces a stochastic conjugate gradient (CG) algorithm with variance reduction that converges faster than its counterparts on four learning models and, on large-scale datasets, reaches AUC comparable to LIBLINEAR at a lower computational cost.
Contribution
The paper presents a novel stochastic CG algorithm with variance reduction and proves its linear convergence, with the Fletcher and Reeves method, for strongly convex and smooth functions.
Findings
Faster convergence than its counterparts on four learning models, which may be convex, nonconvex or nonsmooth
AUC comparable to that of the LIBLINEAR solver (L2-regularized L2-loss) on six large-scale datasets
Significant improvement in computational efficiency over LIBLINEAR
Abstract
Conjugate gradient (CG) methods are a class of important methods for solving linear equations and nonlinear optimization problems. In this paper, we propose a new stochastic CG algorithm with variance reduction and we prove its linear convergence with the Fletcher and Reeves method for strongly convex and smooth functions. We experimentally demonstrate that the CG with variance reduction algorithm converges faster than its counterparts for four learning models, which may be convex, nonconvex or nonsmooth. In addition, its area under the curve (AUC) performance on six large-scale data sets is comparable to that of the LIBLINEAR solver for the L2-regularized L2-loss but with a significant improvement in computational efficiency.
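To make the idea in the abstract concrete, here is a minimal sketch of how a Fletcher and Reeves CG direction can be combined with a variance-reduced mini-batch gradient. It is not the paper's algorithm. Several things are assumptions made only for illustration: the variance reduction is SVRG-style (a full gradient at a periodic snapshot corrects each mini-batch gradient), the objective is L2-regularized least squares, and the step size is an exact line search on the mini-batch quadratic, standing in for whatever line search the paper uses. The names `ridge_grad` and `cg_variance_reduced` are hypothetical.

```python
import numpy as np


def ridge_grad(w, X, y, lam):
    """Gradient of 1/(2n) * ||Xw - y||^2 + lam/2 * ||w||^2 over the given rows."""
    return X.T @ (X @ w - y) / X.shape[0] + lam * w


def cg_variance_reduced(X, y, lam=0.1, epochs=20, inner_steps=10,
                        batch_size=32, tol=1e-20, seed=0):
    """Illustrative CG with SVRG-style variance reduction (assumed form).

    Requires lam > 0 and batch_size <= number of rows in X.
    """
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    w = np.zeros(dim)
    for _ in range(epochs):
        # Snapshot and its full gradient, recomputed once per epoch.
        w_snap = w.copy()
        mu = ridge_grad(w_snap, X, y, lam)
        g = mu.copy()
        d = -g  # each epoch restarts from the steepest-descent direction
        for _ in range(inner_steps):
            gg = g @ g
            if gg < tol:  # gradient estimate is numerically zero
                break
            idx = rng.choice(n, size=batch_size, replace=False)
            Xb, yb = X[idx], y[idx]
            # Exact line search along d on the mini-batch quadratic model
            # (an assumption; valid only because the loss is quadratic).
            Xd = Xb @ d
            curvature = Xd @ Xd / batch_size + lam * (d @ d)
            alpha = -(g @ d) / curvature
            w = w + alpha * d
            # Variance-reduced gradient: mini-batch gradient at w, corrected
            # by the same mini-batch at the snapshot plus the full gradient.
            g_new = (ridge_grad(w, Xb, yb, lam)
                     - ridge_grad(w_snap, Xb, yb, lam) + mu)
            # Fletcher and Reeves coefficient.
            beta = (g_new @ g_new) / gg
            d = -g_new + beta * d
            if g_new @ d >= 0:  # safeguard: fall back to steepest descent
                d = -g_new
            g = g_new
    return w
```

When `batch_size` equals the number of rows, the corrected gradient reduces to the full gradient and the loop becomes ordinary linear CG with restarts, which gives a simple sanity check against the closed-form ridge solution. With smaller batches the iterates are stochastic, and the behaviour depends on the batch size and the number of inner steps.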
