Jagged AI in Scientific Peer Review: Evidence from POMP Data Analysis

Jin Wook Lee; William Szegda; Zhisheng Song; Edward L. Ionides

arXiv:2605.07855·stat.AP·May 19, 2026

TL;DR

This study investigates the uneven performance of AI tools in scientific peer review and finds a jagged capability profile: AI excels at detecting technical errors but struggles with interpretive and narrative assessments.

Contribution

The paper provides empirical evidence of AI's jagged performance in peer review across diverse tasks and shows that this pattern is inherent to the AI model, not merely a consequence of the specific instructions given.

Findings

01. AI reviewers detect technical errors effectively
02. AI struggles with interpretive and narrative errors
03. Jagged performance pattern is consistent across AI agents

Abstract

Despite their growing use in academic writing and statistical analysis, the performance of artificial intelligence (AI) tools in scientific peer review remains a largely unexplored area. A key challenge is jagged AI, a phenomenon where AI exhibits strong ability spikes in some domains while remaining deficient in others. To study this jaggedness in a practical data science context, we considered the task of reviewing partially observed Markov process (POMP) data analyses. POMP models, also known as state-space models or hidden Markov models, are used to fit mechanistic dynamic models to time series data in diverse applications including disease transmission, ecological dynamics, and financial risk assessment. High-quality peer review in this area entails assessment of scientific context, identification of errors in implementing complex algorithms, and decisions concerning methodological…
