Risk and parameter convergence of logistic regression
Ziwei Ji, Matus Telgarsky

TL;DR
This paper analyzes gradient descent on logistic regression. It shows that the iterates follow a data-dependent ray whose direction is a maximum margin predictor and whose offset is a bounded risk minimizer, and it gives an explicit convergence rate for each.
Contribution
The paper gives a theoretical analysis of gradient descent on logistic regression, characterizing both the geometry of the iterates (a bias towards a maximum margin predictor) and their convergence rates.
Findings
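Gradient descent iterates are biased to follow a unique ray defined by the data.
The direction of this ray is the maximum margin predictor of a maximal linearly separable subset of the data.
The ray's offset is the bounded global optimum of the risk over the remaining data, and gradient descent recovers it at an explicit rate.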
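The directional bias can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a hypothetical two-point separable dataset `Z` (labels already folded into the rows), for which the maximum margin direction works out by hand to `(1, 0)`, and a step size of `0.1`. The function name `logistic_gd_angles` and the checkpoint values are assumptions made for this example.

```python
import numpy as np

def logistic_gd_angles(Z, u_bar, step=0.1, checkpoints=(200, 2000, 20000)):
    """Run gradient descent on the mean logistic loss; record the angle to u_bar."""
    w = np.zeros(Z.shape[1])
    angles = []
    for t in range(1, max(checkpoints) + 1):
        margins = Z @ w
        # sigmoid(-margins), written with tanh for numerical stability
        weights = 0.5 * (1.0 - np.tanh(0.5 * margins))
        # gradient of mean(ln(1 + exp(-margins))) with respect to w
        grad = -(Z * weights[:, None]).mean(axis=0)
        w = w - step * grad
        if t in checkpoints:
            cos = w @ u_bar / np.linalg.norm(w)
            angles.append(float(np.arccos(np.clip(cos, -1.0, 1.0))))
    return w, angles

# Hypothetical data: rows are y_i * x_i. The constraints w.z_i >= 1 force
# w_1 >= 1, and w = (1, 0) meets both, so the max margin direction is (1, 0).
Z = np.array([[1.0, 2.0], [1.0, -1.0]])
u_bar = np.array([1.0, 0.0])

w, angles = logistic_gd_angles(Z, u_bar)
for t, a in zip((200, 2000, 20000), angles):
    print(f"t = {t:6d}   angle to max margin direction = {a:.4f} rad")
```

On this toy problem the angle shrinks as the iteration count grows, but slowly, because the norm of the iterate grows only logarithmically while its component orthogonal to `u_bar` stays bounded.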
Abstract
Gradient descent, when applied to the task of logistic regression, outputs iterates which are biased to follow a unique ray defined by the data. The direction of this ray is the maximum margin predictor of a maximal linearly separable subset of the data; the gradient descent iterates converge to this ray in direction at the rate O(ln ln t / ln t). The ray does not pass through the origin in general, and its offset is the bounded global optimum of the risk over the remaining data; gradient descent recovers this offset at a rate O((ln t)^2 / sqrt(t)).
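To see why the ray need not pass through the origin, consider a deliberately simple, hypothetical dataset (not from the paper) in which the first coordinate is separable and the second is not. In the sketch below, the rows of `Z` are assumed to be label-times-feature vectors, and `run_gd` and the step size are illustrative choices. The first coordinate of the iterate grows without bound, while the second settles at the finite minimizer of the risk over the two non-separable points.

```python
import numpy as np

def sigmoid(x):
    # numerically stable logistic function
    return 0.5 * (1.0 + np.tanh(0.5 * x))

def run_gd(Z, step=0.1, iters=20000):
    """Plain gradient descent on the mean logistic loss, started at zero."""
    w = np.zeros(Z.shape[1])
    for _ in range(iters):
        grad = -(Z * sigmoid(-(Z @ w))[:, None]).mean(axis=0)
        w = w - step * grad
    return w

# Hypothetical data: (1, 0) is separable along the first axis; (0, 1) and
# (0, -2) pull the second coordinate in opposite directions, so no direction
# along that axis classifies both correctly.
Z = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, -2.0]])
w = run_gd(Z)

# Derivative of ln(1 + exp(-v)) + ln(1 + exp(2 v)) at v = w[1]; it is
# (near) zero when w[1] minimizes the risk of the non-separable points.
v = w[1]
residual = -sigmoid(-v) + 2.0 * sigmoid(2.0 * v)

print("separable coordinate (keeps growing):", w[0])
print("offset coordinate (converges):", w[1])
print("stationarity residual of the offset:", residual)
```

The coordinates decouple here only because the example was built that way; in general the separable points can also have components in the non-separable subspace.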
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Statistical Methods and Inference
