A Survey on Hypothesis Generation for Scientific Discovery in the Era of Large Language Models
Atilla Kaan Alkan, Shashwat Sourav, Maja Jablonska, Simone Astarita, Rishabh Chakrabarty, Nikhil Garuda, Pranav Khetarpal, Maciej Pióro, Dimitrios Tanoglidis, Kartheik G. Iyer, Mugdha S. Polimera, Michael J. Smith, Tirthankar Ghosal, Marc Huertas-Company, Sandor Kruk

TL;DR
This survey reviews how Large Language Models are used to generate scientific hypotheses, categorizing methods, analyzing quality improvement techniques, and discussing challenges and future research directions.
Contribution
It provides a comprehensive taxonomy of LLM-based hypothesis generation methods and analyzes strategies for enhancing hypothesis quality and evaluation.
Findings
Categorized existing LLM methods for hypothesis generation
Identified techniques for improving hypothesis novelty and reasoning
Discussed key challenges and future directions in the field
Abstract
Hypothesis generation is a fundamental step in scientific discovery, yet it is increasingly challenged by information overload and disciplinary fragmentation. Recent advances in Large Language Models (LLMs) have sparked growing interest in their potential to enhance and automate this process. This paper presents a comprehensive survey of hypothesis generation with LLMs by (i) reviewing existing methods, from simple prompting techniques to more complex frameworks, and proposing a taxonomy that categorizes these approaches; (ii) analyzing techniques for improving hypothesis quality, such as novelty boosting and structured reasoning; (iii) providing an overview of evaluation strategies; and (iv) discussing key challenges and future directions, including multimodal integration and human-AI collaboration. Our survey aims to serve as a reference for researchers exploring LLMs for hypothesis…
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Artificial Intelligence in Healthcare and Education
