Learning Stable Predictors from Weak Supervision under Distribution Shift
Mehrdad Shoeibi, Elias Hossain, Ivan Garibay, Niloofar Yousefi

TL;DR
This paper investigates the robustness of weak supervision methods under distribution shifts, especially temporal changes, using CRISPR-Cas13d transcriptomic data across different cell lines and timepoints.
Contribution
It formalizes supervision drift as changes in $P(y \,|\, x, c)$ across contexts and provides a benchmark demonstrating the limitations of weak supervision under temporal shifts.
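As a minimal illustration of this definition, the synthetic sketch below keeps the input distribution fixed and changes only the mapping from features to labels between two hypothetical contexts, so that $P(y \,|\, x, c)$ differs while $P(x)$ does not. It is not the paper's benchmark or data: the weights, sample sizes, noise level, and all function names are assumptions chosen for illustration.

```python
import random

def make_context(weights, n, seed, noise=0.1):
    """Sample (x, y) with y = weights . x + Gaussian noise.
    x is standard normal in every context, so only P(y | x, c) changes."""
    rng = random.Random(seed)
    xs = [[rng.gauss(0, 1) for _ in weights] for _ in range(n)]
    ys = [sum(w * v for w, v in zip(weights, x)) + rng.gauss(0, noise) for x in xs]
    return xs, ys

def fit_ridge_2d(xs, ys, lam=1.0):
    """Closed-form ridge for two features, no intercept:
    solve (X^T X + lam * I) w = X^T y."""
    a = sum(x[0] * x[0] for x in xs) + lam
    b = sum(x[0] * x[1] for x in xs)
    d = sum(x[1] * x[1] for x in xs) + lam
    p = sum(x[0] * y for x, y in zip(xs, ys))
    q = sum(x[1] * y for x, y in zip(xs, ys))
    det = a * d - b * b
    return [(d * p - b * q) / det, (a * q - b * p) / det]

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((s - mu) * (t - mv) for s, t in zip(u, v))
    su = sum((s - mu) ** 2 for s in u) ** 0.5
    sv = sum((t - mv) ** 2 for t in v) ** 0.5
    return cov / (su * sv)

def score(w, xs, ys):
    """Correlation between model predictions and labels."""
    preds = [w[0] * x[0] + w[1] * x[1] for x in xs]
    return pearson(preds, ys)

# Hypothetical contexts: the label depends on feature 0 in A and on feature 1 in B.
train_a = make_context([1.0, 0.0], n=500, seed=0)
test_a = make_context([1.0, 0.0], n=500, seed=1)
test_b = make_context([0.0, 1.0], n=500, seed=2)

w_a = fit_ridge_2d(*train_a)
in_score = score(w_a, *test_a)     # same supervision mechanism as training
cross_score = score(w_a, *test_b)  # drifted supervision mechanism

print(f"in-context correlation:    {in_score:.3f}")
print(f"cross-context correlation: {cross_score:.3f}")
```

In this toy setting the model fit in context A scores highly on held-out data from A and close to zero on context B, even though the inputs are drawn from the same distribution in both.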
Findings
Weak supervision supports in-domain learning and partial transfer across cell lines.
Temporal transfer collapses under supervision drift: models trained at one timepoint perform poorly at another.
Feature importances are not consistently stable across timepoints, which points to supervision drift as the cause of the transfer failures.
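One simple way to quantify feature importance stability is the rank correlation between importance vectors obtained in two contexts. The sketch below uses Spearman correlation for this. It is an assumed diagnostic, not necessarily the measure used in the paper, and the importance values are made-up placeholders rather than reported results.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman(a, b):
    """Spearman rank correlation for two tie-free sequences of equal length."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical importance vectors over the same four features.
imp_early = [0.40, 0.30, 0.20, 0.10]
imp_same_order = [0.35, 0.30, 0.20, 0.15]  # same ranking: stable
imp_reversed = [0.10, 0.20, 0.30, 0.40]    # reversed ranking: unstable

stable = spearman(imp_early, imp_same_order)
unstable = spearman(imp_early, imp_reversed)
print(stable, unstable)
```

A value near 1 means the two contexts rank features the same way, while low or negative values mean the model relies on different features in each context, which is the pattern one would expect when the supervision mechanism has drifted.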
Abstract
Learning from weak, proxy, or relative supervision is common when ground-truth labels are unavailable, but robustness under distribution shift remains poorly understood because the supervision mechanism itself may change across environments. We formalize this phenomenon as supervision drift, defined as changes in $P(y \,|\, x, c)$ across contexts, and study it in CRISPR-Cas13d transcriptomic perturbation experiments where guide efficacy is inferred indirectly from RNA-seq responses. Using publicly available data spanning two human cell lines and multiple post-induction timepoints, we construct a controlled non-IID benchmark with explicit domain (cell line) and temporal shifts, while reusing a fixed weak-label construction across all contexts to avoid changing targets. Across linear and tree-based models, weak supervision supports meaningful learning in-domain (ridge ,…
