Statistical inference for case-control logistic regression via integrating external summary data
Hengchao Shi, Xinyi Liu, Ming Zheng, Wen Yu

TL;DR
This paper introduces an empirical likelihood method that combines internal case-control data with external summary data to enable consistent estimation of all logistic regression parameters, including the intercept and case proportion.
Contribution
The paper proposes a novel empirical likelihood approach that incorporates external summary data to identify and estimate all parameters in case-control logistic regression models.
Findings
Intercept becomes identifiable with external data.
All parameters are estimated consistently.
Estimators are asymptotically normal.
Abstract
Case-control sampling is a commonly used retrospective sampling design to alleviate the imbalanced structure of binary data. When fitting the logistic regression model with case-control data, the slope parameter of the model can be consistently estimated, but the intercept parameter is not identifiable and the marginal case proportion is not estimable. We consider situations in which, besides the case-control data from the main study (the internal study), summary-level information from related external studies is also available. An empirical-likelihood-based approach is proposed for inference on the logistic model that combines the internal case-control data with the external information. We show that the intercept parameter is identifiable with the help of external information, and then all the regression parameters as well as the marginal case proportion can be…
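The identifiability issue described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's empirical likelihood estimator. It uses the classical offset correction and assumes, purely for illustration, that the external summary information is the population case proportion itself. All names (fit_logistic, alpha_true, beta_true, pi_pop, and so on) and all numerical settings are hypothetical choices.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Plain Newton-Raphson logistic fit; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                          # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(0)
alpha_true, beta_true = -4.0, 1.0   # assumed true intercept and slope (rare outcome)

# Simulated source population (illustrative only).
N = 500_000
x = rng.normal(size=N)
y = rng.random(N) < 1.0 / (1.0 + np.exp(-(alpha_true + beta_true * x)))
pi_pop = y.mean()                   # stands in for an external summary statistic

# Case-control sample: fixed numbers of cases and controls.
n1 = n0 = 5000
cases = rng.choice(np.flatnonzero(y), size=n1, replace=False)
controls = rng.choice(np.flatnonzero(~y), size=n0, replace=False)
idx = np.concatenate([cases, controls])

X_cc = np.column_stack([np.ones(idx.size), x[idx]])
y_cc = y[idx].astype(float)
alpha_cc, beta_cc = fit_logistic(X_cc, y_cc)

# Case-control sampling shifts the intercept by log{(n1/n0) * (1 - pi) / pi};
# knowing pi from outside the sample lets us undo the shift.
shift = np.log((n1 / n0) * (1.0 - pi_pop) / pi_pop)
alpha_corrected = alpha_cc - shift

print(f"slope: {beta_cc:.3f} (true {beta_true})")
print(f"naive intercept: {alpha_cc:.3f}, corrected: {alpha_corrected:.3f} (true {alpha_true})")
```

In this sketch the slope from the case-control fit stays close to its true value, while the naive intercept is off by the sampling shift and is recovered only after the external case proportion is plugged in. The paper's setting is more general: the external information is summary-level output from related studies rather than the case proportion itself.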
Taxonomy
Topics: Statistical Methods and Inference
Methods: Sparse Evolutionary Training · Logistic Regression
