Proximal Stochastic Newton-type Gradient Descent Methods for Minimizing Regularized Finite Sums
Ziqiang Shi

TL;DR
This paper introduces PROXTONE, a proximal stochastic Newton-type gradient method that leverages second order information to achieve linear convergence in both objective value and solutions for optimizing regularized finite sums.
Contribution
It unifies previous proximal stochastic gradient methods and incorporates second order information to improve convergence rates and solution accuracy.
Findings
Achieves linear convergence in objective function value.
Achieves linear convergence in the solution.
Provides a simple and intuitive proof technique.
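For reference, the two notions of linear convergence listed above are usually written as follows. This is the standard textbook form rather than the paper's exact statement; the symbols $F$ (the regularized objective), $x^k$ (the $k$-th iterate), $x^\ast$ (a minimizer), the constants $C_1, C_2 > 0$, and the rates $\rho_1, \rho_2 \in (0,1)$ are notation assumed here, and the expectation is over the random component choices of a stochastic method.

```latex
% Linear convergence in objective value:
\mathbb{E}\bigl[F(x^k)\bigr] - F(x^\ast) \le C_1\,\rho_1^{\,k},
\qquad 0 < \rho_1 < 1 .

% Linear convergence in the solution (iterates):
\mathbb{E}\bigl[\lVert x^k - x^\ast \rVert\bigr] \le C_2\,\rho_2^{\,k},
\qquad 0 < \rho_2 < 1 .
```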
Abstract
In this work, we generalize and unify two recent and quite different works, by Jascha \cite{sohl2014fast} and Lee \cite{lee2012proximal}, by proposing the \textbf{prox}imal s\textbf{to}chastic \textbf{N}ewton-type gradient (PROXTONE) method for optimizing the sum of two convex functions: one is the average of a huge number of smooth convex functions, and the other is a non-smooth convex function. While a set of recently proposed proximal stochastic gradient methods, including MISO, Prox-SDCA, Prox-SVRG, and SAG, converge at linear rates, PROXTONE incorporates second order information to obtain stronger convergence results: it achieves a linear convergence rate not only in the value of the objective function, but also in the \emph{solution}. The proof is simple and intuitive, and the results and technique can serve as a starting point for research on the…
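The problem class described in the abstract (an average of many smooth convex terms plus a non-smooth convex regularizer) can be illustrated with a small Python sketch. This is not the paper's PROXTONE algorithm: it is a simplified illustration that assumes least-squares components, an L1 regularizer, and a fixed diagonal curvature model per component, so that each regularized quadratic subproblem has a closed-form soft-thresholding solution. All function and variable names (`proximal_newton_type_sketch`, `soft_threshold`, `objective`, `lam`) are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise proximal operator of t * |.| (t may be a vector).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, b, lam, x):
    # F(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2 + lam * ||x||_1
    r = A @ x - b
    return 0.5 * np.mean(r ** 2) + lam * np.sum(np.abs(x))

def proximal_newton_type_sketch(A, b, lam, n_iters=2000, seed=0):
    # Assumption: no column of A is identically zero, so D_bar > 0.
    rng = np.random.default_rng(seed)
    n, p = A.shape
    x = np.zeros(p)
    # Assumed diagonal curvature model for component i:
    # D[i, j] = |a_ij| * ||a_i||_1, which upper-bounds a_i a_i^T
    # (diagonal dominance), so each quadratic model majorizes g_i.
    D = np.abs(A) * np.sum(np.abs(A), axis=1, keepdims=True)
    D_bar = D.mean(axis=0)
    anchors = np.tile(x, (n, 1))            # point where model i was built
    grads = A * (A @ x - b)[:, None]        # gradient of g_i at its anchor
    g_bar = grads.mean(axis=0)
    Dtheta_bar = (D * anchors).mean(axis=0)
    for _ in range(n_iters):
        # Minimize the averaged quadratic models plus lam * ||x||_1.
        x = soft_threshold((Dtheta_bar - g_bar) / D_bar, lam / D_bar)
        # Refresh the model of one randomly chosen component at the new x.
        j = rng.integers(n)
        new_grad = A[j] * (A[j] @ x - b[j])
        g_bar += (new_grad - grads[j]) / n
        Dtheta_bar += D[j] * (x - anchors[j]) / n
        grads[j] = new_grad
        anchors[j] = x
    return x
```

Each iteration touches only one component's gradient, while the regularized subproblem uses curvature information from all stored models. A method that uses fuller second order information would replace the diagonal models with richer Hessian approximations and solve the subproblem with an inner solver.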
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research
