BadScientist: Can a Research Agent Write Convincing but Unsound Papers that Fool LLM Reviewers?
Fengqing Jiang, Yichen Feng, Yuetai Li, Luyao Niu, Basel Alomair, Radha Poovendran

TL;DR
This paper demonstrates that AI-generated research papers can deceive current AI peer review systems, exposing vulnerabilities and highlighting the need for more robust safeguards in scientific publishing.
Contribution
It introduces BadScientist, a framework for generating fabricated papers that can fool AI reviewers, and provides a rigorous evaluation revealing systemic vulnerabilities.
Findings
Fabricated papers achieve high acceptance rates
Reviewers often accept papers despite flagging integrity issues
Detection of fabricated papers remains near random chance
Abstract
The convergence of LLM-powered research assistants and AI-based peer review systems creates a critical vulnerability: fully automated publication loops where AI-generated research is evaluated by AI reviewers without human oversight. We investigate this through BadScientist, a framework that evaluates whether fabrication-oriented paper generation agents can deceive multi-model LLM review systems. Our generator employs presentation-manipulation strategies requiring no real experiments. We develop a rigorous evaluation framework with formal error guarantees (concentration bounds and calibration analysis), calibrated on real data. Our results reveal systematic vulnerabilities: fabricated papers achieve acceptance rates up to . Critically, we identify concern-acceptance conflict: reviewers frequently flag integrity issues yet assign acceptance-level scores. Our…
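The abstract mentions concentration bounds as part of the error guarantees but does not say which bound is used. As a rough illustration only, the sketch below applies a two-sided Hoeffding bound to an acceptance rate estimated from independent accept/reject decisions. The function name `hoeffding_interval`, its parameters, and the choice of Hoeffding's inequality are assumptions made for this example, not the authors' method, and the inputs in the usage line are hypothetical rather than results from the paper.

```python
import math


def hoeffding_interval(accepted: int, total: int, delta: float = 0.05):
    """Return (lower, estimate, upper) for an acceptance rate.

    Assumes `total` independent accept/reject decisions (an assumption of
    this sketch). With probability at least 1 - delta, the true rate lies
    within `half_width` of the empirical rate, by Hoeffding's inequality:
        P(|p_hat - p| >= eps) <= 2 * exp(-2 * n * eps**2)
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= accepted <= total:
        raise ValueError("accepted must be between 0 and total")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")

    estimate = accepted / total
    # Solve 2 * exp(-2 * n * eps**2) = delta for eps.
    half_width = math.sqrt(math.log(2.0 / delta) / (2.0 * total))
    lower = max(0.0, estimate - half_width)
    upper = min(1.0, estimate + half_width)
    return lower, estimate, upper


# Hypothetical inputs, not figures from the paper.
lower, estimate, upper = hoeffding_interval(accepted=50, total=100, delta=0.05)
print(f"rate = {estimate:.2f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

The interval narrows at a rate of one over the square root of the number of reviewed papers, so quadrupling the sample halves the width. Hoeffding's bound is distribution-free and therefore conservative; a tighter bound may be what the paper actually uses.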
Taxonomy
Topics: Academic integrity and plagiarism · Academic Publishing and Open Access · Expert finding and Q&A systems
