DiscoVerse: Multi-Agent Pharmaceutical Co-Scientist for Traceable Drug Discovery and Reverse Translation
Xiaochen Zheng, Alvaro Serra, Ilya Schneider Chernov, Maddalena Marchesi, Eunice Musvasva, Tatyana Y. Doktorova

TL;DR
DiscoVerse is a multi-agent system designed to assist pharmaceutical researchers by retrieving, summarizing, and linking data from vast archives, thereby enhancing reverse translation and decision-making in drug development.
Contribution
This work introduces DiscoVerse, the first agentic framework systematically evaluated on real pharmaceutical data for reverse translation, with role-specific agents and human-in-the-loop support.
Findings
Achieved near-perfect recall (≥0.99) on benchmark queries.
Demonstrated faithful, source-linked synthesis across evidence.
Supported decision-making in real-world pharmaceutical cases.
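For readers unfamiliar with the metric, recall is read here in its standard information-retrieval sense: the fraction of relevant items that the system actually retrieves. This summary does not describe the paper's benchmark protocol, so the sketch below only illustrates the metric itself. The function name and document IDs are hypothetical and do not come from DiscoVerse.

```python
def recall(retrieved: set[str], relevant: set[str]) -> float:
    """Return the fraction of relevant items that appear in the retrieved set."""
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant items")
    return len(retrieved & relevant) / len(relevant)


# Hypothetical document IDs, for illustration only.
relevant = {"doc-1", "doc-2", "doc-3", "doc-4"}
retrieved = {"doc-1", "doc-2", "doc-3", "doc-9"}

# Three of the four relevant documents were retrieved.
score = recall(retrieved, relevant)
```

Under this definition, a recall of at least 0.99 means that at most 1 in 100 relevant items is missed. It says nothing about how many irrelevant items were also returned, which is what precision measures.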
Abstract
Pharmaceutical research and development has accumulated vast and heterogeneous archives of data. Much of this knowledge stems from discontinued programs, and reusing these archives is invaluable for reverse translation. However, in practice, such reuse is often infeasible. In this work, we introduce DiscoVerse, a multi-agent co-scientist designed to support pharmaceutical research and development at Roche. Designed as a human-in-the-loop assistant, DiscoVerse enables domain-specific queries by delivering evidence-based answers: it retrieves relevant data, links across documents, summarises key findings and preserves institutional memory. We assess DiscoVerse through expert evaluation of source-linked outputs. Our evaluation spans a selected subset of 180 molecules from Roche's research and development repositories, encompassing over 0.87 billion BPE tokens and more than four decades of…
Taxonomy
Topics: Computational Drug Discovery Methods · Scientific Computing and Data Management · Machine Learning in Materials Science
