TL;DR
This paper introduces a structured explanation formalism and a multi-agent framework for autonomous mechanistic reasoning in virtual cells, improving the factual accuracy of biological explanations and supporting scientific discovery.
Contribution
It presents VCR-Agent, a novel multi-agent system that generates and verifies mechanistic explanations, and releases the VC-TRACES dataset for virtual cell reasoning.
Findings
Training with verified explanations improves gene expression prediction.
The framework enhances factual precision in biological reasoning.
The VC-TRACES dataset provides verified mechanistic explanations.
Abstract
Large language models (LLMs) have recently gained significant attention as a promising approach to accelerate scientific discovery. However, their application in open-ended scientific domains such as biology remains limited, primarily due to the lack of factually grounded and actionable explanations. To address this, we introduce a structured explanation formalism for virtual cells that represents biological reasoning as mechanistic action graphs, enabling systematic verification and falsification. Building upon this, we propose VCR-Agent, a multi-agent framework that integrates biologically grounded knowledge retrieval with a verifier-based filtering approach to generate and validate mechanistic reasoning autonomously. Using this framework, we release the VC-TRACES dataset, which consists of verified mechanistic explanations derived from the Tahoe-100M atlas. Empirically, we demonstrate…
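To make the idea of a mechanistic action graph with verifier-based filtering more concrete, here is a minimal sketch in Python. It is not the paper's implementation: the `Action` type, the `verify` and `filter_verified` helpers, and the toy entities (`drug_X`, `kinase_A`, `gene_B`, `gene_C`) are all hypothetical, and the "knowledge base" is just a set of known triples standing in for whatever retrieval the actual framework uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One edge of a hypothetical mechanistic action graph."""
    source: str
    relation: str
    target: str


def verify(graph, knowledge):
    """Return the actions in `graph` that the knowledge set does not support."""
    return [action for action in graph if action not in knowledge]


def filter_verified(candidates, knowledge):
    """Keep only candidate graphs in which every action is supported."""
    return [graph for graph in candidates if not verify(graph, knowledge)]


# Toy knowledge set (placeholder names, not real biology).
KNOWN = {
    Action("drug_X", "inhibits", "kinase_A"),
    Action("kinase_A", "activates", "gene_B"),
}

# Two candidate explanations: one fully supported, one not.
supported = [
    Action("drug_X", "inhibits", "kinase_A"),
    Action("kinase_A", "activates", "gene_B"),
]
unsupported = [Action("drug_X", "activates", "gene_C")]

kept = filter_verified([supported, unsupported], KNOWN)
```

Representing each explanation as a list of discrete, checkable edges is what lets a verifier reject a candidate by pointing at the specific unsupported step, rather than judging a free-text explanation as a whole.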
