Nonstandard Errors in AI Agents

Ruijiang Gao; Steven Chong Xiao

arXiv:2603.16744·cs.AI·March 20, 2026

Open Access

TL;DR

This paper examines how much results vary when AI coding agents analyze the same data and research question. It finds sizable nonstandard errors and systematic differences in methodological choices across model families, with implications for the reliability of AI-driven research.

Contribution

It extends the concept of nonstandard errors, previously documented among human researchers, to AI agents, demonstrating how methodological divergence affects empirical results and exploring how peer review influences convergence.

Findings

01. AI agents show substantial variation in analysis choices.

02. Different model families have stable, systematic differences.

03. Peer review has limited impact on reducing variability.

Abstract

We study whether state-of-the-art AI coding agents, given the same data and research question, produce the same empirical results. Deploying 150 autonomous Claude Code agents to independently test six hypotheses about market quality trends in NYSE TAQ data for SPY (2015–2024), we find that AI agents exhibit sizable nonstandard errors (NSEs), that is, uncertainty from agent-to-agent variation in analytical choices, analogous to those documented among human researchers. AI agents diverge substantially on measure choice (e.g., autocorrelation vs. variance ratio, dollar vs. share volume). Different model families (Sonnet 4.6 vs. Opus 4.6) exhibit stable "empirical styles," reflecting systematic differences in methodological preferences. In a three-stage feedback protocol, AI peer review (written critiques) has minimal effect on dispersion, whereas exposure to top-rated…
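To make the idea of agent-to-agent dispersion concrete, here is a minimal sketch of how a nonstandard error might be summarized for a single hypothesis. It is an illustration under stated assumptions, not the authors' procedure: the function name `nonstandard_error`, the choice of sample standard deviation and interquartile range as dispersion measures, and the numbers in `agent_estimates` are all hypothetical.

```python
import statistics


def nonstandard_error(estimates):
    """Summarize cross-agent dispersion of point estimates.

    Returns (sample standard deviation, interquartile range).
    Both are illustrative choices of dispersion measure (assumption).
    """
    if len(estimates) < 2:
        raise ValueError("need at least two agent estimates")
    sd = statistics.stdev(estimates)
    # Quartiles via linear interpolation over the observed range.
    q1, _median, q3 = statistics.quantiles(estimates, n=4, method="inclusive")
    return sd, q3 - q1


# Hypothetical estimates of the same trend coefficient, one per agent.
# These values are made up for illustration; they are not from the paper.
agent_estimates = [1.0, 2.0, 3.0, 4.0, 5.0]

sd, iqr = nonstandard_error(agent_estimates)
print(f"cross-agent std dev: {sd:.3f}, IQR: {iqr:.3f}")
```

Each agent's own standard error describes sampling uncertainty within one analysis; the quantities above instead describe how far the point estimates spread across agents that made different analytical choices on the same data.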

Taxonomy

Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Computational and Text Analysis Methods