Overeager Coding Agents: Measuring Out-of-Scope Actions on Benign Tasks
Yubin Qu, Ying Zhang, Yanjun Zhang, Gelei Deng, Yuekang Li, Leo Yu Zhang, and Yi Liu

TL;DR
This paper investigates overeager actions by autonomous coding agents on benign tasks, introduces benchmarks to measure this behavior, and analyzes how prompt design and model choices influence scope expansion risks.
Contribution
The paper presents the OverEager-Gen and OverEager-Bench benchmarks to quantify overeager behavior, and demonstrates how prompt design affects scope expansion in coding agents.
Findings
Stripping the consent declaration from the prompt raises the overeager rate on Claude Code from 0.0% to 17.1% on paired scenarios.
Benchmarking reveals substantial variance across models and frameworks.
Prompt design and permission gating influence overeager behavior levels.
Abstract
Coding agents now run autonomously with shell, file, and network privileges. When a user issues a benign request, the agent sometimes does more than asked: it deletes unrelated files, wipes a stale credentials backup, or rewrites configuration the user never mentioned. We call these scope expansions overeager actions, an authorization problem distinct from capability failures, prompt injection, or sandbox escapes. We present OverEager-Gen, a benchmark dedicated to overeager behavior on benign tasks. Building it surfaces a measurement-validity issue: if a benchmark spells out the authorized scope inside the prompt, the agent stops inferring boundaries and starts pattern-matching declaration text. On Claude Code, stripping the consent declaration alone raises the overeager rate from 0.0% to 17.1% on paired scenarios (McNemar exact p = 2.4 x 10^-4). OverEager-Gen therefore certifies each…
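The abstract reports a McNemar exact test on paired scenarios, where each scenario is run once with the consent declaration and once without it. A minimal sketch of how such a p-value can be computed is shown below. The function name `mcnemar_exact` and the discordant-pair counts are assumptions made for illustration; the excerpt does not give the paper's actual counts, and the values here were chosen only because they produce a p-value of the same order as the one reported.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: pairs that were overeager WITH the declaration but not without it.
    c: pairs that were overeager WITHOUT the declaration but not with it.
    Concordant pairs (same outcome in both conditions) do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair falls either way
    # with probability 0.5, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, NOT taken from the paper.
p = mcnemar_exact(0, 13)
print(f"{p:.2e}")  # prints 2.44e-04
```

Only the discordant pairs matter here: a scenario that behaves the same way in both conditions carries no information about the effect of removing the declaration.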
