Risk and parameter convergence of logistic regression

Ziwei Ji; Matus Telgarsky

arXiv:1803.07300·cs.LG·June 11, 2019·79 cites

TL;DR

This paper analyzes gradient descent for logistic regression and shows that the iterates follow a ray defined by the data: their direction converges to a maximum margin predictor at rate $O(\ln \ln t / \ln t)$, and the ray's offset is recovered at rate $O((\ln t)^{2} / t)$.

Contribution

It provides a theoretical analysis of the convergence rates and geometry of gradient descent for logistic regression, including its bias toward the maximum margin predictor of a maximal linearly separable subset of the data.

Findings

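01. Gradient descent iterates are biased to follow a unique ray defined by the data.

02. The direction of this ray is the maximum margin predictor of a maximal linearly separable subset of the data; the iterates converge to it in direction at rate $O(\ln \ln t / \ln t)$.

03. The ray's offset is the bounded global optimum of the risk over the remaining data; gradient descent recovers it at rate $O((\ln t)^{2} / t)$.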

Abstract

Gradient descent, when applied to the task of logistic regression, outputs iterates which are biased to follow a unique ray defined by the data. The direction of this ray is the maximum margin predictor of a maximal linearly separable subset of the data; the gradient descent iterates converge to this ray in direction at the rate $O(\ln \ln t / \ln t)$. The ray does not pass through the origin in general, and its offset is the bounded global optimum of the risk over the remaining data; gradient descent recovers this offset at a rate $O((\ln t)^{2} / t)$.
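The directional bias described in the abstract can be illustrated with a small numerical sketch. The toy data set, step size, and function names below are assumptions chosen for illustration and do not come from the paper. The two points are linearly separable, so this is only the special case in which the whole data set is its own maximal separable subset. Each point is written as z = y * x, with the label folded in. For this data the hard-margin solution can be worked out by hand: minimizing ||w|| subject to <w, z> >= 1 gives w = (0.5, 1), so the maximum margin direction is (1, 2) / sqrt(5).

```python
import math

# Hypothetical toy data (not from the paper): each point is z = y * x,
# i.e. the label is already folded into the feature vector.
POINTS = [(2.0, 0.0), (0.0, 1.0)]

# Hand-derived maximum margin direction for POINTS: (1, 2) / sqrt(5).
MAX_MARGIN_DIRECTION = (1.0 / math.sqrt(5.0), 2.0 / math.sqrt(5.0))


def sigmoid(u):
    """Logistic function 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def gradient_descent_direction(points, steps, lr=1.0):
    """Run gradient descent from w = 0 on the mean logistic loss
    ln(1 + exp(-<w, z>)) and return the normalized final iterate."""
    w = [0.0, 0.0]
    n = len(points)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for z in points:
            margin = w[0] * z[0] + w[1] * z[1]
            # d/dw ln(1 + exp(-<w, z>)) = -sigmoid(-<w, z>) * z
            coeff = -sigmoid(-margin) / n
            grad[0] += coeff * z[0]
            grad[1] += coeff * z[1]
        w = [w[0] - lr * grad[0], w[1] - lr * grad[1]]
    norm = math.hypot(w[0], w[1])
    return (w[0] / norm, w[1] / norm)


def cosine(u, v):
    """Cosine of the angle between two unit vectors."""
    return u[0] * v[0] + u[1] * v[1]


if __name__ == "__main__":
    for steps in (100, 1000, 10000):
        direction = gradient_descent_direction(POINTS, steps)
        print(steps, cosine(direction, MAX_MARGIN_DIRECTION))
```

The cosine with the maximum margin direction should creep toward 1 as the number of steps grows, but only slowly, which is consistent with a rate that is logarithmic in t. The sketch does not exercise the non-separable case, where the abstract says the ray has a nonzero offset.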

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Statistical Methods and Inference