Randomization Does Not Justify Logistic Regression
David A. Freedman

TL;DR
Randomization alone does not justify fitting a logit model to experimental data, so the usual estimators can be inconsistent. The paper proposes a consistent estimator and uses Neyman's non-parametric setup as the benchmark.
Contribution
The paper proposes a consistent estimator for randomized experiments that are analyzed with the logit model, where the usual estimators can be inconsistent when judged against Neyman's non-parametric setup.
Findings
Traditional logistic regression estimators can be inconsistent under randomization.
The proposed estimator is consistent within Neyman's potential outcomes framework.
Simulation results accompany the mathematical analysis.
Abstract
The logit model is often used to analyze experimental data. However, randomization does not justify the model, so the usual estimators can be inconsistent. A consistent estimator is proposed. Neyman's non-parametric setup is used as a benchmark. In this setup, each subject has two potential responses, one if treated and the other if untreated; only one of the two responses can be observed. Besides the mathematics, there are simulation results, a brief review of the literature, and some recommendations for practice.
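The setup described in the abstract can be sketched as a small simulation. This is an illustrative sketch, not the paper's code or its proposed estimator: the function names, the data-generating process, and the choice of target (the difference in log odds between the average treated response and the average untreated response) are assumptions made here. The estimate shown is the plain comparison of observed group proportions, which uses no logit model at all.

```python
import math
import random


def simulate_neyman(n=10000, seed=0):
    """Build a hypothetical population with two fixed potential responses
    per subject, then randomize half of the subjects to treatment."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # latent trait, an assumption for illustration
        p_c = 1.0 / (1.0 + math.exp(-(-0.5 + u)))  # chance of response 1 if untreated
        p_t = 1.0 / (1.0 + math.exp(-(0.5 + u)))   # chance of response 1 if treated
        y_c = 1 if rng.random() < p_c else 0
        y_t = 1 if rng.random() < p_t else 0
        subjects.append((y_c, y_t))

    # Randomization: exactly n // 2 subjects are assigned to treatment.
    treated = set(rng.sample(range(n), n // 2))

    # Only one of the two potential responses is observed per subject.
    observed = [
        (1, y_t) if i in treated else (0, y_c)
        for i, (y_c, y_t) in enumerate(subjects)
    ]
    return subjects, observed


def log_odds(p):
    return math.log(p / (1.0 - p))


def true_diff_log_odds(subjects):
    """Target computed from BOTH potential responses (never available in practice)."""
    n = len(subjects)
    avg_c = sum(y_c for y_c, _ in subjects) / n
    avg_t = sum(y_t for _, y_t in subjects) / n
    return log_odds(avg_t) - log_odds(avg_c)


def observed_diff_log_odds(observed):
    """Estimate from observed data only: compare the two group proportions."""
    treated = [y for x, y in observed if x == 1]
    control = [y for x, y in observed if x == 0]
    return log_odds(sum(treated) / len(treated)) - log_odds(sum(control) / len(control))


if __name__ == "__main__":
    subjects, observed = simulate_neyman()
    print("target (needs both responses):", round(true_diff_log_odds(subjects), 3))
    print("estimate (observed data only):", round(observed_diff_log_odds(observed), 3))
```

In this sketch the only randomness that links the observed data to the target is the assignment itself, which is the sense in which Neyman's setup serves as a model-free benchmark.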
