No Foundations without Foundations -- Why semi-mechanistic models are essential for regulatory biology
Luka Kovačević, Thomas Gaudelet, James Opzoomer, Hagen Triendl, John Whittaker, Caroline Uhler, Lindsay Edwards, Jake P. Taylor-King

TL;DR
This paper advocates for semi-mechanistic models that integrate mechanistic insights with experimental design to advance predictive understanding in regulatory biology, emphasizing the importance of first-principles approaches over purely data-driven methods.
Contribution
The authors introduce a semi-mechanistic framework unifying perturbation experiments across various systems, linking it to existing models and demonstrating improved predictive performance.
Findings
A modified loss function enhances prediction accuracy
Error analysis informs batching strategies
The framework clarifies assumptions in machine learning methods
Abstract
Despite substantial efforts, deep learning has not yet delivered a transformative impact on elucidating regulatory biology, particularly in the realm of predicting gene expression profiles. Here, we argue that genuine "foundation models" of regulatory biology will remain out of reach unless guided by frameworks that integrate mechanistic insight with principled experimental design. We present one such ground-up, semi-mechanistic framework that unifies perturbation-based experimental designs across both in vitro and in vivo CRISPR screens, accounting for differentiating and non-differentiating cellular systems. By revealing previously unrecognised assumptions in published machine learning methods, our approach clarifies links with popular techniques such as variational autoencoders and structural causal models. In practice, this framework suggests a modified loss function that we…
Taxonomy
Topics: Gene Regulatory Network Analysis
