All That Glitters is Not Novel: Plagiarism in AI Generated Research
Tarun Gupta, Danish Pruthi

TL;DR
This paper reveals that a significant portion of AI-generated research documents are plagiarized or heavily borrowed from existing work, raising concerns about the authenticity and originality of automated scientific research.
Contribution
It provides empirical evidence of plagiarism in AI-generated research, highlighting the limitations of current detection methods and urging careful evaluation of such automated research outputs.
Findings
24% of evaluated documents are paraphrased or borrowed from existing work
76% of documents show varying degrees of similarity to existing research
Automated plagiarism detectors are ineffective at identifying plagiarized AI-generated research
Abstract
Automating scientific research is considered the final frontier of science. Recently, several papers claim autonomous research agents can generate novel research ideas. Amidst the prevailing optimism, we document a critical concern: a considerable fraction of such research documents are smartly plagiarized. Unlike past efforts where experts evaluate the novelty and feasibility of research ideas, we request experts to operate under a different situational logic: to identify similarities between LLM-generated research documents and existing work. Concerningly, the experts identify 24% of the evaluated research documents to be either paraphrased (with one-to-one methodological mapping), or significantly borrowed from existing work. These reported instances are cross-verified by authors of the source papers. The remaining 76% of documents show varying degrees of similarity…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education
