EvoIdeator: Evolving Scientific Ideas through Checklist-Grounded Reinforcement Learning
Andreas Sauter, Yuyue Zhao, Jacopo Urbani, Wenxiang Hu, Zaiqiao Meng, Lun Zhou, Xiaohui Yan, Yougang Lyu

TL;DR
EvoIdeator introduces a reinforcement learning framework that uses structured, checklist-grounded feedback to systematically evolve scientific ideas, improving idea quality and generalization in autonomous knowledge discovery.
Contribution
The paper presents EvoIdeator, a novel RL approach that incorporates multi-dimensional and fine-grained feedback for scientific idea evolution, outperforming larger models.
Findings
EvoIdeator significantly outperforms larger models on scientific metrics.
The framework generalizes well to external feedback sources.
It enables scalable, self-refining autonomous ideation.
Abstract
Scientific idea generation is a cornerstone of autonomous knowledge discovery, yet the iterative evolution required to transform initial concepts into high-quality research proposals remains a formidable challenge for Large Language Models (LLMs). Existing Reinforcement Learning (RL) paradigms often rely on rubric-based scalar rewards that provide global quality scores but lack actionable granularity. Conversely, language-based refinement methods are typically confined to inference-time prompting, targeting models that are not explicitly optimized to internalize such critiques. To bridge this gap, we propose \textbf{EvoIdeator}, a framework that facilitates the evolution of scientific ideas by aligning the RL training objective with \textbf{checklist-grounded feedback}. EvoIdeator leverages a structured judge model to generate two synergistic signals: (1) \emph{lexicographic rewards}…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMachine Learning in Materials Science · Topic Modeling · Domain Adaptation and Few-Shot Learning
