A semiparametric efficient estimator in case-control studies
Yanyuan Ma

TL;DR
This paper develops a semiparametric efficient estimator for case-control studies under the assumption of gene-environment independence. The gene distribution is modeled parametrically, as either discrete or continuous, while the environment distribution is left unspecified.
Contribution
It introduces a novel semiparametric estimator that achieves efficiency in case-control studies with flexible gene distribution modeling and unknown environment distribution.
Findings
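The estimator is proved to be efficient in a hypothetical setting where a close approximation to the samples is random.
The efficiency is further demonstrated to hold in the case-control setting.
The gene distribution can be discrete (mutation or presence of a group of genes) or continuous (gene expression levels), which broadens applicability.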
Abstract
We construct a semiparametric estimator in case-control studies where the gene and the environment are assumed to be independent. A discrete or continuous parametric distribution of the genes is assumed in the model. A discrete distribution of the genes can be used to model the mutation or presence of a certain group of genes. A continuous distribution allows the distribution of the gene effects to be in a finite-dimensional parametric family and can hence be used to model the gene expression levels. We leave the distribution of the environment totally unspecified. The estimator is derived through calculating the efficiency score function in a hypothetical setting where a close approximation to the samples is random. The resulting estimator is proved to be efficient in the hypothetical situation. The efficiency of the estimator is further demonstrated to hold in the case-control setting…
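To make the data setting concrete, the following is a minimal simulation sketch of case-control sampling under gene-environment independence. It is not the paper's estimator. The function name `simulate_case_control`, the Bernoulli gene, the standard normal environment, the logistic disease model, and all parameter values are assumptions chosen for illustration only.

```python
import math
import random


def simulate_case_control(n_cases, n_controls, beta, p_gene=0.3, seed=0):
    """Draw a case-control sample from a hypothetical population.

    Illustrative assumptions (not taken from the paper):
      G ~ Bernoulli(p_gene), a discrete gene indicator
      E ~ N(0, 1), drawn independently of G
      logit P(D = 1 | G, E) = b0 + bG*G + bE*E + bGE*G*E
    """
    rng = random.Random(seed)
    b0, b_g, b_e, b_ge = beta
    cases, controls = [], []
    # Keep drawing subjects from the population until both quotas are full,
    # which mimics sampling a fixed number of cases and controls.
    while len(cases) < n_cases or len(controls) < n_controls:
        g = 1 if rng.random() < p_gene else 0
        e = rng.gauss(0.0, 1.0)
        eta = b0 + b_g * g + b_e * e + b_ge * g * e
        p_disease = 1.0 / (1.0 + math.exp(-eta))
        d = 1 if rng.random() < p_disease else 0
        if d == 1 and len(cases) < n_cases:
            cases.append((g, e, 1))
        elif d == 0 and len(controls) < n_controls:
            controls.append((g, e, 0))
    return cases + controls


# Hypothetical parameter values: (b0, bG, bE, bGE)
sample = simulate_case_control(200, 200, beta=(-3.0, 0.5, 0.4, 0.2))
```

Each element of `sample` is a tuple `(g, e, d)`. Because the numbers of cases and controls are fixed by design, the sample is not a random draw from the population, which is the feature of case-control data that the abstract's hypothetical setting is meant to approximate.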
