Finite-sample performance of the maximum likelihood estimator in logistic regression
Hugo Chardon, Matthieu Lerasle, Jaouad Mourtada

TL;DR
This paper provides non-asymptotic guarantees for the existence and accuracy of the maximum likelihood estimator in logistic regression, considering various covariate distributions and model specifications.
Contribution
It extends sharp finite-sample analysis of the MLE in logistic regression to non-Gaussian covariates and misspecified models, including Bernoulli designs.
Findings
Sharp non-asymptotic guarantees for Gaussian covariates
Extension to non-Gaussian covariates under margin conditions
Analysis of MLE behavior in Bernoulli design cases
Abstract
Logistic regression is a classical model for describing the probabilistic dependence of binary responses on multivariate covariates. We consider the predictive performance of the maximum likelihood estimator (MLE) for logistic regression, assessed in terms of logistic risk. We consider two questions: first, that of the existence of the MLE (which occurs when the dataset is not linearly separated), and second, that of its accuracy when it exists. These properties depend on both the dimension of covariates and the signal strength. In the case of Gaussian covariates and a well-specified logistic model, we obtain sharp non-asymptotic guarantees for the existence and excess logistic risk of the MLE. We then generalize these results in two ways: first, to non-Gaussian covariates satisfying a certain two-dimensional margin condition, and second to the general case of statistical learning with…
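The existence question raised in the abstract can be illustrated on toy data. The sketch below is not taken from the paper: it assumes labels in {-1, +1}, a model without intercept, and plain gradient descent as the optimizer, and the helper names (`logistic_risk`, `gradient_descent`) and both datasets are hypothetical. On the non-separated dataset the empirical logistic risk has a finite minimizer, which here is (log 2, -log 2) by a direct calculation. On the linearly separated dataset the risk keeps decreasing along a separating direction, so no MLE exists and the iterates keep growing.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(u):
    # Numerically stable logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def logistic_risk(theta, xs, ys):
    """Empirical logistic risk: mean of log(1 + exp(-y * <theta, x>))."""
    total = 0.0
    for x, y in zip(xs, ys):
        m = y * dot(theta, x)
        # Stable evaluation of log(1 + exp(-m)).
        total += max(-m, 0.0) + math.log1p(math.exp(-abs(m)))
    return total / len(xs)

def gradient_descent(xs, ys, steps, lr=1.0):
    """Plain gradient descent on the empirical logistic risk, started at 0."""
    n = len(xs)
    theta = [0.0] * len(xs[0])
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for x, y in zip(xs, ys):
            # Gradient of log(1 + exp(-m)) in theta is -y * x * sigmoid(-m).
            w = -y * sigmoid(-y * dot(theta, x)) / n
            for j, xj in enumerate(x):
                grad[j] += w * xj
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

# Non-separated toy data (assumption): labels disagree along each axis,
# so the risk has a finite minimizer, (log 2, -log 2) in closed form.
xs_mixed = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1), (0, 1)]
ys_mixed = [1, 1, -1, 1, -1, -1]
theta_mixed = gradient_descent(xs_mixed, ys_mixed, steps=2000)

# Linearly separated toy data (assumption): theta = (1, 0) gives every
# point a positive margin, so scaling theta up drives the risk toward 0.
xs_sep = [(1, 0.5), (2, -0.3), (-1, 0.2), (-1.5, -0.4)]
ys_sep = [1, 1, -1, -1]
risks_along_ray = [logistic_risk((s, 0.0), xs_sep, ys_sep)
                   for s in (1.0, 10.0, 100.0)]

# On separated data the iterates never settle: more steps, larger theta.
theta_short = gradient_descent(xs_sep, ys_sep, steps=200)
theta_long = gradient_descent(xs_sep, ys_sep, steps=2000)

print("non-separated fit:", theta_mixed)
print("risk along separating ray:", risks_along_ray)
print("separated fit after 200 / 2000 steps:", theta_short, theta_long)
```

The toy data are deterministic, so this only illustrates the dichotomy. How likely separation is for random covariates, as a function of dimension and signal strength, is what the paper's guarantees quantify.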
Taxonomy
Topics: Advanced Statistical Methods and Models · Advanced Statistical Process Monitoring · Statistical Methods and Inference
