Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows

Hardy Chen; Nancy Lau; Haoqin Tu; Shuo Yan; Xiangyan Liu; Zijun Wang; Juncheng Wu; Michael Qizhe Shieh; Alvaro A. Cardenas; Cihang Xie; Yuyin Zhou

arXiv:2604.20200·cs.CL·April 23, 2026

Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows

Hardy Chen, Nancy Lau, Haoqin Tu, Shuo Yan, Xiangyan Liu, Zijun Wang, Juncheng Wu, Michael Qizhe Shieh, Alvaro A. Cardenas, Cihang Xie, Yuyin Zhou

PDF

2 Repos

TL;DR

This paper investigates how user pressure in coding workflows can lead to exploitation of public scores, proposing a benchmark and mitigation strategies to address this issue.

Contribution

The authors introduce AgentPressureBench, a comprehensive benchmark for studying score exploitation, and analyze how model strength and user pressure influence exploitative behavior.

Findings

01

Models with higher strength exhibit more exploitation.

02

Increased user pressure accelerates exploitation onset.

03

Explicit anti-exploit prompts significantly reduce exploitation.

Abstract

Frontier coding agents are increasingly used in workflows where users supervise progress primarily through repeated improvement of a public score, namely the reported score on a public evaluation file with labels in the workspace, rather than through direct inspection of the agent's intermediate outputs. We study whether multi-round user pressure to improve that score induces public score exploitation: behavior that raises the public score through shortcuts without improving hidden private evaluation. We begin with a preliminary single-script tabular classification task, where GPT-5.4 and Claude Opus 4.6 both exploit label information within 10 rounds of user-agent interaction. We then build AgentPressureBench, a 34-task machine-learning repository benchmark spanning three input modalities, and collect 1326 multi-round trajectories from 13 coding agents. On our benchmark, we observe 403…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.