TL;DR
This paper introduces stabilized regression, a method that enforces predictor stability across different environments to improve generalization, with applications in systems biology and a theoretical link to causal models.
Contribution
The paper proposes stabilized regression, a method that explicitly enforces stability across environments. It also introduces the concept of a stable blanket and connects multi-environment regression to causal inference.
Findings
Stabilized regression improves prediction in unseen environments.
The stable blanket is an optimal predictor subset for generalization.
The paper establishes a theoretical link between multi-environment regression and causal models.
Abstract
We consider regression in which one predicts a response with a set of predictors across different experiments or environments. This is a common setup in many data-driven scientific fields and we argue that statistical inference can benefit from an analysis that takes into account the distributional changes across environments. In particular, it is useful to distinguish between stable and unstable predictors, i.e., predictors which have a fixed or a changing functional dependence on the response, respectively. We introduce stabilized regression which explicitly enforces stability and thus improves generalization performance to previously unseen environments. Our work is motivated by an application in systems biology. Using multiomic data, we demonstrate how hypothesis generation about gene function can benefit from stabilized regression. We believe that a similar line of…
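The distinction between stable and unstable predictors can be illustrated with a small simulation. The sketch below is a toy example and not the paper's procedure: the data-generating model, the helper names (`simulate`, `ols`, `predict`), and the crude "instability" score (the largest gap between per-environment least-squares coefficients) are all assumptions made for illustration. Here `x1` has a fixed functional relation to the response in every environment, while the relation between `x2` and the response changes from one environment to the next.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, a, x1_scale):
    """Hypothetical environment: x1 -> y is fixed, y -> x2 depends on `a`."""
    x1 = rng.normal(0.0, x1_scale, n)
    y = 2.0 * x1 + rng.normal(0.0, 1.0, n)   # same mechanism in every environment
    x2 = a * y + rng.normal(0.0, 0.5, n)     # mechanism changes with the environment
    return np.column_stack([x1, x2]), y

def ols(X, y):
    """Least-squares fit with intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

# Two observed environments and one previously unseen environment.
train_envs = [simulate(2000, a=1.0, x1_scale=1.0),
              simulate(2000, a=0.5, x1_scale=2.0)]
X_test, y_test = simulate(2000, a=-1.0, x1_scale=1.5)

subsets = {"x1": [0], "x2": [1], "x1+x2": [0, 1]}
results = {}
for name, cols in subsets.items():
    # Crude stability check (an assumption, not a formal test):
    # compare the coefficients fitted separately in each environment.
    per_env = np.array([ols(X[:, cols], y) for X, y in train_envs])
    instability = np.max(np.abs(per_env[0] - per_env[1]))

    # Fit on the pooled training data, evaluate on the unseen environment.
    X_pool = np.vstack([X[:, cols] for X, _ in train_envs])
    y_pool = np.concatenate([y for _, y in train_envs])
    coef = ols(X_pool, y_pool)
    test_mse = np.mean((y_test - predict(coef, X_test[:, cols])) ** 2)

    results[name] = (instability, test_mse)
    print(f"{name:6s} instability={instability:.2f} test_mse={test_mse:.2f}")
```

In this toy setup, the regression on `x1` alone has nearly identical coefficients in both training environments and keeps a low error in the unseen environment, whereas any subset that includes `x2` fits differently per environment and degrades once the `x2` mechanism shifts. A real analysis would replace the coefficient gap with a proper statistical test for stability.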
