Doubly Flexible Estimation under Label Shift

Seong-ho Lee; Yanyuan Ma; Jiwei Zhao

arXiv:2307.04250·stat.ME·July 11, 2023

TL;DR

This paper introduces a doubly flexible estimation method for the label shift setting. The method tolerates misspecification of both the regression model of $Y$ given $X$ and the density ratio of $Y$, and supports inference when the target population provides only partial data.

Contribution

It proposes a novel estimation procedure that is doubly flexible, accommodating misspecifications in both the regression model and density ratio without requiring their direct estimation.

Findings

01

The estimator's large-sample properties are established theoretically.

02

Simulation studies demonstrate robustness under model misspecification.

03

Application to MIMIC-III data shows practical effectiveness.

Abstract

In studies ranging from clinical medicine to policy research, complete data are usually available from a population $P$, but the quantity of interest is often sought for a related but different population $Q$ which only has partial data. In this paper, we consider the setting that both outcome $Y$ and covariate $X$ are available from $P$ whereas only $X$ is available from $Q$, under the so-called label shift assumption, i.e., the conditional distribution of $X$ given $Y$ remains the same across the two populations. To estimate the parameter of interest in $Q$ via leveraging the information from $P$, the following three ingredients are essential: (a) the common conditional distribution of $X$ given $Y$, (b) the regression model of $Y$ given $X$ in $P$, and (c) the density ratio of $Y$ …
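The label shift setting in the abstract can be illustrated with a small simulation. This is a minimal sketch under assumed Gaussian distributions chosen purely for illustration; the function names and parameter values are hypothetical and do not come from the paper. It plugs in the true density ratio of $Y$, which is known here only because the data are simulated, so it is not the paper's proposed estimator. It only checks the reweighting identity that label shift implies: averages over $Q$ equal density-ratio-weighted averages over $P$.

```python
import numpy as np


def simulate_label_shift(n_p=200_000, n_q=200_000, mu_p=0.0, mu_q=0.5, seed=0):
    """Draw (X, Y) from P and Q so that X | Y is identical in both populations.

    Assumed toy model (not from the paper):
      Y ~ N(mu_p, 1) in P,  Y ~ N(mu_q, 1) in Q,  X | Y ~ N(Y, 1) in both.
    """
    rng = np.random.default_rng(seed)
    y_p = rng.normal(mu_p, 1.0, n_p)
    y_q = rng.normal(mu_q, 1.0, n_q)  # unobserved in practice; kept for checking
    x_p = y_p + rng.normal(0.0, 1.0, n_p)  # same conditional law of X given Y
    x_q = y_q + rng.normal(0.0, 1.0, n_q)
    return x_p, y_p, x_q, y_q


def true_density_ratio(y, mu_p=0.0, mu_q=0.5):
    """Ratio q(y) / p(y) of the N(mu_q, 1) and N(mu_p, 1) densities."""
    return np.exp((mu_q - mu_p) * y - 0.5 * (mu_q**2 - mu_p**2))


x_p, y_p, x_q, y_q = simulate_label_shift()
rho = true_density_ratio(y_p)

# Under label shift, E_Q[g(X)] = E_P[rho(Y) g(X)] for any function g.
mean_x_in_q_direct = x_q.mean()             # uses Q covariates only
mean_x_in_q_reweighted = np.mean(rho * x_p)  # uses P data plus the density ratio

# The same weights recover a Q-population quantity involving the unobserved Y.
mean_y_in_q_reweighted = np.mean(rho * y_p)
```

In this toy setup the two estimates of the mean of $X$ in $Q$ should agree up to Monte Carlo error, and the reweighted mean of $Y$ should be close to the assumed `mu_q`. In real data the density ratio is unknown.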

Taxonomy

Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Advanced Causal Inference Techniques