Awes, Laws, and Flaws From Today's LLM Research

Adrian de Wynter

arXiv:2408.15409·cs.CL·June 3, 2025

Open Access

TL;DR

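This paper critically examines the methodology of recent large language model (LLM) research. It identifies trends such as declining ethics disclaimers and the growing use of LLMs as evaluators, finds that conference checklists curb some of these issues but are not sufficient on their own, and offers recommendations for improving research rigor and ethics.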

Contribution

The paper analyzes over 2,000 LLM research works released between 2020 and 2024, identifies methodological trends, and proposes improvements to research practice.

Findings

01. Decline in ethics disclaimers over time

02. Rise in LLMs used as evaluators

03. Increase in claims of reasoning abilities without human validation

Abstract

We perform a critical examination of the scientific methodology behind contemporary large language model (LLM) research. For this, we assess over 2,000 research works released between 2020 and 2024 based on criteria typical of what is considered good research (e.g., presence of statistical tests and reproducibility), and cross-validate it with arguments that are at the centre of controversy (e.g., claims of emergent behaviour). We find multiple trends, such as declines in ethics disclaimers, a rise of LLMs as evaluators, and an increase in claims of LLM reasoning abilities without leveraging human evaluation. We note that conference checklists are effective at curtailing some of these issues, but balancing velocity and rigour in research cannot rely solely on these. We tie all these findings to findings from recent meta-reviews and extend recommendations on how to address what does, does…

Taxonomy

Topics: Legal Education and Practice Innovations · Artificial Intelligence in Law · Law, AI, and Intellectual Property