A Process-Centric Survey of AI for Scientific Discovery Through the EXHYTE Framework
Md Musaddaqul Hasib, Sumin Jo, Harsh Sinha, Jifeng Song, Arun Das, Zhentao Liu, Hugh Galloway, Huey Huang, Kexun Zhang, Shou-Jiang Gao, Yu-Chiao Chiu, Lei Li, Yufei Huang

TL;DR
This paper introduces the EXHYTE framework, a structured approach to understanding how AI contributes to scientific discovery through iterative cycles of exploration, hypothesis generation, and testing.
Contribution
The paper introduces the EXHYTE cycle, a novel framework that unifies AI-driven scientific discovery into a structured process.
Findings
The EXHYTE cycle identifies mature and underexplored substages in AI-driven discovery.
The framework reveals how AI methods can complement human researchers in a structured workflow.
A website with paper summaries and an interactive survey is provided to support the EXHYTE framework.
Abstract
Large language models (LLMs) and agent systems are increasingly transforming scientific discovery, driving progress across chemistry, biology, materials science, and physics. Yet most existing work and surveys remain fragmented, focusing on isolated tasks such as idea generation or experiment design without addressing how these components fit within the broader discovery process. To bridge this gap, we introduce the EXHYTE cycle, an iterative framework that formalizes scientific discovery as a sequence of Exploration, Hypothesis generation, and Testing. We assembled a corpus of recent studies, distilled recurring strategies that characterize how AI methods contribute to each EXHYTE substage, and organized the literature according to representative strategies and domain-specific advances. This process-centric perspective unifies diverse methodologies under a single structured workflow,…
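As a rough illustration of the iterative structure described in the abstract, the loop below sketches one way an Exploration, Hypothesis generation, and Testing cycle could be wired together. It is a toy under stated assumptions, not the authors' implementation: the names `CycleState`, `run_cycle`, `explore`, `hypothesize`, and `test` are hypothetical, and the substages the survey identifies are not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CycleState:
    """Hypothetical container for what the loop has learned so far."""
    observations: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)
    accepted: Optional[str] = None


def run_cycle(
    explore: Callable[[CycleState], List[str]],
    hypothesize: Callable[[CycleState], Optional[str]],
    test: Callable[[str], bool],
    max_iterations: int = 5,
) -> CycleState:
    """Repeat explore -> hypothesize -> test until a hypothesis passes,
    no hypothesis is left, or the iteration budget is spent."""
    state = CycleState()
    for _ in range(max_iterations):
        # Exploration: gather new observations given the current state.
        state.observations.extend(explore(state))
        # Hypothesis generation: propose a candidate from what is known.
        hypothesis = hypothesize(state)
        if hypothesis is None:
            break
        # Testing: accept the candidate or feed the failure back in.
        if test(hypothesis):
            state.accepted = hypothesis
            break
        state.rejected.append(hypothesis)
    return state


# Toy stand-ins for the three stages (illustrative assumptions only).
CANDIDATES = ["A", "B", "C"]


def demo_explore(state: CycleState) -> List[str]:
    return [f"observation {len(state.observations) + 1}"]


def demo_hypothesize(state: CycleState) -> Optional[str]:
    remaining = [c for c in CANDIDATES if c not in state.rejected]
    return remaining[0] if remaining else None


def demo_test(hypothesis: str) -> bool:
    return hypothesis == "B"


result = run_cycle(demo_explore, demo_hypothesize, demo_test)
print(result.accepted, result.rejected, len(result.observations))
```

In this toy run the first candidate fails testing and is fed back into the next round, which is the feedback step that distinguishes a cycle from a one-pass pipeline. In a real system each callable would presumably wrap an LLM, an agent, or a human researcher.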
Taxonomy
Topics: Machine Learning in Materials Science · Scientific Computing and Data Management · Artificial Intelligence in Healthcare and Education
