Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses
Ulysse Marteau-Ferey (SIERRA, DI-ENS, PSL), Francis Bach (SIERRA, DI-ENS, PSL), Alessandro Rudi (SIERRA, DI-ENS, PSL)

TL;DR
This paper introduces globally convergent Newton methods for ill-conditioned generalized self-concordant losses, achieving optimal complexity and generalization bounds in large-scale convex optimization, especially for logistic and softmax regressions.
Contribution
It proposes a new Newton-based scheme with proven global convergence and improved behavior in ill-conditioned problems, extending to non-parametric settings with theoretical guarantees.
Findings
The scheme converges linearly, with a constant factor that scales only logarithmically with the condition number.
Provides an explicit non-parametric algorithm with optimal complexity and no dependence on condition number.
First large-scale method with theoretical guarantees for logistic and softmax regression in ill-conditioned settings.
Abstract
In this paper, we study large-scale convex optimization algorithms based on the Newton method applied to regularized generalized self-concordant losses, which include logistic regression and softmax regression. We first prove that our new simple scheme based on a sequence of problems with decreasing regularization parameters is provably globally convergent, and that this convergence is linear with a constant factor which scales only logarithmically with the condition number. In the parametric setting, we obtain an algorithm with the same scaling as regular first-order methods but with an improved behavior, in particular in ill-conditioned problems. Second, in the non-parametric machine learning setting, we provide an explicit algorithm combining the previous scheme with Nyström projection techniques, and prove that it achieves optimal generalization bounds with a time complexity of…
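The core idea in the abstract, solving a sequence of problems with decreasing regularization parameters and warm-starting each from the previous one, can be illustrated with a minimal sketch for L2-regularized logistic regression. This is not the authors' algorithm: the starting value `lam0`, the shrink factor `shrink`, the single exact Newton step per stage, and the function names are all assumptions made for illustration. The abstract does not specify the schedule or stopping rule, and a large-scale method would not rely on exact dense Hessian solves as this sketch does.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def grad_hess(w, X, y, lam):
    """Gradient and Hessian of mean(log(1 + exp(-y * Xw))) + (lam / 2) * ||w||^2.

    Labels y are assumed to take values in {-1, +1}.
    """
    n, d = X.shape
    p = sigmoid(-y * (X @ w))            # sigma(-margin), one value per sample
    grad = -(X.T @ (y * p)) / n + lam * w
    s = p * (1.0 - p)                    # per-sample curvature of the logistic loss
    hess = (X.T * s) @ X / n + lam * np.eye(d)
    return grad, hess

def newton_step(w, X, y, lam):
    """One exact (undamped) Newton step on the lam-regularized problem."""
    grad, hess = grad_hess(w, X, y, lam)
    return w - np.linalg.solve(hess, grad)

def decreasing_reg_newton(X, y, lam_target, lam0=1.0, shrink=0.5, final_steps=10):
    """Illustrative schedule (an assumption, not taken from the paper):
    one Newton step per stage while lam shrinks geometrically from lam0
    to lam_target, then a few Newton steps on the target problem.
    """
    w = np.zeros(X.shape[1])
    lam = max(lam0, lam_target)
    while lam > lam_target:
        w = newton_step(w, X, y, lam)          # warm start carries over to next stage
        lam = max(lam * shrink, lam_target)
    for _ in range(final_steps):
        w = newton_step(w, X, y, lam_target)
    return w

# Usage on synthetic, badly scaled features (hypothetical data).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 5)) * np.array([1.0, 0.5, 0.1, 0.05, 0.01])
y_demo = np.where(rng.random(200) < sigmoid(X_demo @ np.ones(5)), 1.0, -1.0)
w_demo = decreasing_reg_newton(X_demo, y_demo, lam_target=1e-3)
g_demo, _ = grad_hess(w_demo, X_demo, y_demo, 1e-3)
print("final gradient norm:", np.linalg.norm(g_demo))
```

The intuition behind the design is that a heavily regularized problem is well conditioned, so a Newton step from a crude starting point is safe; each solution then serves as a warm start close enough to the next, slightly less regularized problem for Newton's method to keep behaving well.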
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Sparse and Compressive Sensing Techniques
Methods: Logistic Regression · Softmax
