Evaluating Novelty in AI-Generated Research Plans Using Multi-Workflow LLM Pipelines

Devesh Saraogi; Rohit Singhee; Dhruv Kumar

arXiv:2601.09714 · cs.CL · January 16, 2026

Open Access

TL;DR

This paper evaluates whether multi-step, agentic LLM workflows can produce more novel and feasible research plans than single-step prompting, and highlights the importance of workflow design in AI research ideation.

Contribution

It benchmarks several multi-step LLM workflow architectures for generating research plans and reports that the strongest of them produce more novel plans than simpler methods while remaining feasible.

Findings

01 Decomposition-based workflows achieve high novelty scores (4.17/5).
02 Reflection-based approaches score significantly lower on novelty (2.33/5).
03 High-performing workflows maintain feasibility across domains.

Abstract

The integration of Large Language Models (LLMs) into the scientific ecosystem raises fundamental questions about the creativity and originality of AI-generated research. Recent work has identified "smart plagiarism" as a concern in single-step prompting approaches, where models reproduce existing ideas with terminological shifts. This paper investigates whether agentic workflows (multi-step systems employing iterative reasoning, evolutionary search, and recursive decomposition) can generate more novel and feasible research plans. We benchmark five reasoning architectures: Reflection-based iterative refinement, Sakana AI v2 evolutionary algorithms, Google Co-Scientist multi-agent framework, GPT Deep Research (GPT-5.1) recursive decomposition, and Gemini 3 Pro multimodal long-context pipeline. Using evaluations from thirty proposals each on novelty, feasibility, and impact, we find…
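The evaluation described in the abstract (proposals rated on novelty, feasibility, and impact, then compared across workflows) can be pictured as a per-workflow mean over ratings. The following is a minimal sketch of that aggregation step only, not the authors' code. The function name `mean_scores`, the workflow labels, the 1-to-5 scale check, and every rating in `toy_ratings` are assumptions invented for illustration; none of the numbers come from the paper.

```python
from collections import defaultdict
from statistics import mean

# Criteria named in the abstract; the 1-5 scale is assumed from the "x/5" scores.
CRITERIA = ("novelty", "feasibility", "impact")


def mean_scores(ratings):
    """Average each criterion per workflow.

    ratings: iterable of (workflow_name, {criterion: score}) pairs.
    Returns {workflow_name: {criterion: mean score rounded to 2 decimals}}.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for workflow, scores in ratings:
        for criterion in CRITERIA:
            value = scores[criterion]
            if not 1 <= value <= 5:
                raise ValueError(f"{criterion} score {value} is outside 1-5")
            buckets[workflow][criterion].append(value)
    return {
        workflow: {c: round(mean(values), 2) for c, values in per_criterion.items()}
        for workflow, per_criterion in buckets.items()
    }


# Placeholder ratings, invented for illustration only (hypothetical workflows).
toy_ratings = [
    ("workflow_a", {"novelty": 4, "feasibility": 3, "impact": 4}),
    ("workflow_a", {"novelty": 5, "feasibility": 4, "impact": 3}),
    ("workflow_b", {"novelty": 2, "feasibility": 4, "impact": 2}),
]

summary = mean_scores(toy_ratings)
for workflow, scores in summary.items():
    print(workflow, scores)
```

With real data, each workflow would contribute one entry per rated proposal, and the resulting means would be the kind of figure quoted as "4.17/5" in the findings.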

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Materials Science · Scientific Computing and Data Management