Adaptive Learning with Blockwise Missing and Semi-Supervised Data
Yiming Li, Ying Wei, Molei Liu

TL;DR
This paper introduces DEFUSE, a novel method for data fusion that effectively handles blockwise missing data and semi-supervised learning challenges, improving estimation accuracy in multi-source data with distributional shifts.
Contribution
The paper proposes DEFUSE, a new adaptive estimation approach that manages blockwise missingness and semi-supervised data, with theoretical guarantees and practical improvements over existing methods.
Findings
DEFUSE achieves lower variance in estimates compared to existing methods.
The approach is robust to distributional shifts across data sources.
Empirical results demonstrate improved performance in biomedical applications.
Abstract
Data fusion enables powerful and generalizable analyses across multiple sources. However, different data collection capacities across different sources lead to blockwise missingness (BM), which poses challenges in practice. Meanwhile, the high cost of obtaining gold-standard labels leaves the majority of samples unlabeled, known as the semi-supervised (SS) problem. In this paper, we propose a novel Data-adaptive Estimation approach for data FUsion in the SEmi-supervised setting (DEFUSE) that handles both BM and SS issues in the presence of distributional shifts across data sources under a missing at random (MAR) mechanism. DEFUSE starts with a complete-data-only estimator derived from the primary data source, and uses data-adaptive and distributional-shift-adjusted procedures to successively incorporate the data with BM covariates and the large unlabeled sample to effectively reduce…
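The general idea of starting from a complete-data-only estimator and then borrowing strength from unlabeled samples can be illustrated with a toy simulation. The sketch below is not the DEFUSE procedure: it is a textbook regression-adjusted (control-variate) estimator of a mean, under simplifying assumptions that the abstract does not make (a single covariate, labels missing completely at random, no blockwise missingness, no distributional shift). All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random
import statistics


def simulate(n_labeled, n_unlabeled, rho, rng):
    """Draw X for every unit and Y only for the first n_labeled units.

    Assumption for this toy: X ~ N(0, 1), Y = rho * X + noise, and the
    labeled units are a simple random subset (no distributional shift).
    """
    n_total = n_labeled + n_unlabeled
    x_all = [rng.gauss(0.0, 1.0) for _ in range(n_total)]
    noise_sd = (1.0 - rho ** 2) ** 0.5
    y_labeled = [rho * x_all[i] + noise_sd * rng.gauss(0.0, 1.0)
                 for i in range(n_labeled)]
    return x_all, y_labeled


def complete_case_mean(y_labeled):
    """Baseline: estimate E[Y] from the labeled units only."""
    return statistics.fmean(y_labeled)


def augmented_mean(x_all, y_labeled):
    """Regression-adjusted estimate of E[Y] that also uses unlabeled X.

    Subtracts beta * (mean of X among labeled - mean of X among all units),
    a term with expectation close to zero that is correlated with the
    sampling error of the labeled-only mean.
    """
    x_lab = x_all[:len(y_labeled)]
    x_bar = statistics.fmean(x_lab)
    y_bar = statistics.fmean(y_labeled)
    s_xx = sum((x - x_bar) ** 2 for x in x_lab)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_lab, y_labeled))
    beta = s_xy / s_xx  # least-squares slope of Y on X in the labeled data
    return y_bar - beta * (x_bar - statistics.fmean(x_all))


def compare_variances(reps=200, seed=0, n_labeled=50, n_unlabeled=1000, rho=0.8):
    """Return the empirical variances of both estimators over repeated draws."""
    rng = random.Random(seed)
    cc_estimates, aug_estimates = [], []
    for _ in range(reps):
        x_all, y_labeled = simulate(n_labeled, n_unlabeled, rho, rng)
        cc_estimates.append(complete_case_mean(y_labeled))
        aug_estimates.append(augmented_mean(x_all, y_labeled))
    return statistics.pvariance(cc_estimates), statistics.pvariance(aug_estimates)


if __name__ == "__main__":
    var_cc, var_aug = compare_variances()
    print(f"complete-case variance: {var_cc:.5f}")
    print(f"augmented variance:     {var_aug:.5f}")
```

In this toy setting the augmented estimator tends to have smaller variance because the adjustment term cancels part of the labeled-sample noise. How DEFUSE itself constructs and calibrates its adjustments under blockwise missingness and distributional shift is described in the paper, not here.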
Taxonomy
Topics: Face and Expression Recognition
