On the sample complexity of parameter estimation in logistic regression with normal design
Daniel Hsu, Arya Mazumdar

TL;DR
This paper analyzes the number of samples needed to accurately estimate parameters in logistic regression with normal features, revealing phase transitions in sample complexity across different temperature regimes.
Contribution
It provides the first non-asymptotic analysis of sample complexity dependence on error and inverse temperature in logistic regression with normal design.
Findings
Sample complexity exhibits two change-points in inverse temperature.
These change-points separate the low-, moderate-, and high-temperature regimes, each with its own sample-size requirement.
The analysis clarifies the relationship between temperature, error, and sample size.
Abstract
The logistic regression model is one of the most popular data generation models in noisy binary classification problems. In this work, we study the sample complexity of estimating the parameters of the logistic regression model up to a given error, in terms of the dimension and the inverse temperature, with standard normal covariates. The inverse temperature controls the signal-to-noise ratio of the data generation process. While both generalization bounds and asymptotic performance of the maximum-likelihood estimator for logistic regression are well-studied, the non-asymptotic sample complexity that shows the dependence on error and the inverse temperature for parameter estimation is absent from previous analyses. We show that the sample complexity curve has two change-points in terms of the inverse temperature, clearly separating the low, moderate, and high temperature regimes.
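To make the data generation process concrete, here is a minimal simulation sketch. It is not the authors' code and makes several assumptions: labels take values in {-1, +1}, the true parameter is a unit vector `w`, and the probability of a positive label is the logistic function applied to `beta * <w, x>`, where `beta` is the inverse temperature. The function names `sample_logistic_data` and `label_agreement` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def sample_logistic_data(n, d, beta, seed=0):
    """Draw n samples from an assumed logistic model with standard normal covariates.

    Assumption (not taken verbatim from the paper): w is a unit vector and
    P(y = +1 | x) = 1 / (1 + exp(-beta * <w, x>)).
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)                 # unit-norm true parameter
    X = rng.standard_normal((n, d))        # standard normal design
    z = beta * (X @ w)
    # Numerically stable logistic function: sigma(z) = (1 + tanh(z / 2)) / 2
    p = 0.5 * (1.0 + np.tanh(0.5 * z))
    y = np.where(rng.random(n) < p, 1, -1)
    return X, y, w


def label_agreement(X, y, w):
    """Fraction of labels that match the noiseless label sign(<w, x>)."""
    return float(np.mean(y == np.sign(X @ w)))


if __name__ == "__main__":
    for beta in (0.1, 1.0, 10.0):
        X, y, w = sample_logistic_data(n=5000, d=10, beta=beta)
        print(f"beta={beta:>5}: agreement with sign(<w, x>) = {label_agreement(X, y, w):.3f}")
```

Under these assumptions, a small `beta` gives labels that are close to fair coin flips, while a large `beta` gives labels that nearly always match the sign of `<w, x>`, which is the sense in which the inverse temperature controls the signal-to-noise ratio.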
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models
Methods: Logistic Regression
