Does Code Cleanliness Affect Coding Agents? A Controlled Minimal-Pair Study
Priyansh Trivedi, Olivier Schmitt (SonarSource)

TL;DR
This study investigates whether code cleanliness affects autonomous coding agents' ability to navigate and modify code. Pass rates are unaffected, but cleaner code reduces token usage and file revisitations.
Contribution
The paper introduces a minimal-pair evaluation protocol to isolate the effect of code cleanliness on AI coding agents, revealing its influence on operational efficiency.
Findings
Code cleanliness does not affect pass rates of agents.
Cleaner code reduces token usage by 7-8%.
Cleaner code decreases file revisitations by 34%.
Abstract
As autonomous coding agents see rapid adoption, their evaluation has primarily focused on task completion rates while holding the target codebase fixed. This leaves a critical question unanswered: does the structural and stylistic quality, or "cleanliness", of the underlying code affect an agent's ability to navigate and modify it? To isolate the effect of code cleanliness from agent capability, we introduce an evaluation protocol built around minimal pairs: repositories that match on architecture, dependencies, and external behaviour, but differ on static-analysis rule violations and cognitive complexity. The pairs are constructed in both directions, by agent pipelines that either degrade a clean repository or clean a messy one. We author 33 tasks across six such pairs, evaluated through hidden tests at the application's public surface. Across 660 trials with Claude Code, code cleanliness…
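To make the minimal-pair comparison concrete, here is a rough Python sketch of how per-variant pass rates and relative reductions in cost metrics could be computed from trial logs. The `Trial` record, its field names, and the helper functions are hypothetical and not taken from the paper, and the toy records are invented purely for illustration; the paper's actual analysis may differ.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Trial:
    """One agent run on one task (hypothetical schema, not the paper's)."""
    pair_id: str   # which minimal pair the repository belongs to
    variant: str   # "clean" or "messy" side of the pair
    task_id: str
    passed: bool   # outcome of the hidden tests
    tokens: int    # total tokens consumed by the agent
    revisits: int  # times the agent reopened a file it had already read


def summarize(trials, variant):
    """Aggregate pass rate and mean cost metrics for one side of the pairs."""
    rows = [t for t in trials if t.variant == variant]
    return {
        "pass_rate": mean(1.0 if t.passed else 0.0 for t in rows),
        "mean_tokens": mean(t.tokens for t in rows),
        "mean_revisits": mean(t.revisits for t in rows),
    }


def relative_reduction(messy_value, clean_value):
    """Fraction by which the clean variant is lower than the messy one."""
    return (messy_value - clean_value) / messy_value


# Toy records, invented for illustration only; not data from the paper.
trials = [
    Trial("pair-a", "clean", "t1", True, 900, 2),
    Trial("pair-a", "messy", "t1", True, 1000, 3),
    Trial("pair-a", "clean", "t2", False, 940, 4),
    Trial("pair-a", "messy", "t2", False, 1000, 5),
]

clean = summarize(trials, "clean")
messy = summarize(trials, "messy")
token_saving = relative_reduction(messy["mean_tokens"], clean["mean_tokens"])
revisit_saving = relative_reduction(messy["mean_revisits"], clean["mean_revisits"])

print(f"pass rate: clean={clean['pass_rate']:.2f}, messy={messy['pass_rate']:.2f}")
print(f"token reduction: {token_saving:.1%}, revisit reduction: {revisit_saving:.1%}")
```

Because both sides of a pair share architecture, dependencies, and external behaviour, a difference in these aggregates can be attributed to cleanliness rather than to task difficulty. A fuller analysis would likely compare within each pair and task before averaging.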
