Towards IID representation learning and its application on biomedical data
Jiqing Wu, Inti Zlobec, Maxime Lafarge, Yukun He, Viktor H. Koelzer

TL;DR
This paper introduces IID representation learning as an approach to improving out-of-distribution (OOD) generalization on biomedical data: it learns a task-relevant function under which the transformed data become IID.
Contribution
It proposes a novel IID representation learning framework and demonstrates its effectiveness on biomedical OOD tasks, outperforming state-of-the-art methods.
Findings
Superior OOD generalization performance on biomedical datasets.
Effective induction of IID representations improves robustness.
Reproducible code available for benchmarking and further research.
Abstract
Due to the heterogeneity of real-world data, the widely accepted independent and identically distributed (IID) assumption has been criticized in recent studies on causality. In this paper, we argue that instead of being a questionable assumption, IID is a fundamental task-relevant property that needs to be learned. Considering independent random vectors, we elaborate on how a variety of different causal questions can be reformulated as learning a task-relevant function that induces IID among the transformed vectors, which we term IID representation learning. For proof of concept, we examine IID representation learning on Out-of-Distribution (OOD) generalization tasks. Concretely, by utilizing the representation obtained via the learned function that induces IID, we conduct prediction of molecular characteristics (molecular…
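A toy illustration of the idea in the abstract, not the paper's method: two independent sources share a signal but differ in a nuisance coordinate, and a function that keeps only the shared signal yields representations with matching distributions. The names `sample_environment` and `phi`, the Gaussian data, and the hand-picked projection are all assumptions made for this sketch; in the paper the function is learned rather than chosen by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_environment(shift, n=5000):
    """Draw n 2-D samples: a shared signal plus an environment-specific nuisance."""
    signal = rng.normal(0.0, 1.0, size=n)      # same law in every environment
    nuisance = rng.normal(shift, 1.0, size=n)  # law depends on the environment
    return np.column_stack([signal, nuisance])

def phi(x):
    """Hypothetical task-relevant function: keep only the shared signal coordinate."""
    return x[:, 0]

# Two independent sources that are NOT identically distributed in raw form.
x_a = sample_environment(shift=0.0)
x_b = sample_environment(shift=3.0)

# Largest per-coordinate gap between the sample means of the raw vectors.
raw_gap = np.abs(x_a.mean(axis=0) - x_b.mean(axis=0)).max()

# Gap between the sample means after applying phi.
rep_gap = abs(phi(x_a).mean() - phi(x_b).mean())

print(f"mean gap, raw vectors:     {raw_gap:.3f}")
print(f"mean gap, representations: {rep_gap:.3f}")
```

Comparing sample means is only a crude check of "identically distributed"; a two-sample test on the representations would be a stricter diagnostic.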
Taxonomy
Topics: Gene expression and cancer classification · Health, Environment, Cognitive Aging · Bayesian Modeling and Causal Inference
