The Observability Gap: Why Output-Level Human Feedback Fails for LLM Coding Agents
Yinghao Wang, Cheng Wang

TL;DR
This paper examines why output-only human feedback fails to guide LLM-based coding agents. It identifies a structural observability gap that prevents effective learning and argues that intermediate, code-level feedback is needed.
Contribution
The study identifies a fundamental observability gap in output-only feedback for LLM coding agents and demonstrates that adding minimal code-level information can restore effective learning.
Findings
Under output-only feedback, the agent achieved 0% full-scene success across multiple instruction granularities.
A structural observability gap causes persistent failure modes.
Adding minimal code-level feedback restores convergence.
Abstract
Large language model (LLM) multi-agent coding systems typically fix agent capabilities at design time. We study an alternative setting, earned autonomy, in which a coding agent starts with zero pre-defined functions and incrementally builds a reusable function library through lightweight human feedback on visual output alone. We evaluate this setup in a Blender-based 3D scene generation task requiring both spatial reasoning and programmatic geometric control. Although the agent rediscovered core utility functions comparable to a human reference implementation, it achieved 0% full-scene success under output-only feedback across multiple instruction granularities, where success required satisfying object completeness, ground contact, collision avoidance, and scale plausibility simultaneously. Our analysis identifies a structural observability gap: bugs originate in code logic and…
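
The abstract's success criterion is a conjunction of four checks. As a rough illustration only, the sketch below evaluates such a conjunction on axis-aligned bounding boxes in plain Python. It is not the authors' evaluation code: the `Box` type, the `check_scene` function, the tolerances, and the size limits are all hypothetical, and treating ground contact as "lowest face at z = 0" is a simplifying assumption that ignores objects resting on other objects.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned bounding box for one scene object."""
    name: str
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner


def overlaps(a, b, eps=1e-6):
    # Two boxes collide only if their extents overlap on all three axes.
    return all(a.lo[i] < b.hi[i] - eps and b.lo[i] < a.hi[i] - eps
               for i in range(3))


def check_scene(boxes, required, size_limits, ground_z=0.0, tol=0.01):
    """Evaluate the four criteria; thresholds are illustrative assumptions."""
    names = {b.name for b in boxes}
    report = {
        # Every required object is present in the scene.
        "completeness": set(required) <= names,
        # Each object's lowest face sits on the ground plane (simplified).
        "ground_contact": all(abs(b.lo[2] - ground_z) <= tol for b in boxes),
        # No pair of bounding boxes interpenetrates.
        "collision_free": not any(overlaps(a, b)
                                  for a, b in combinations(boxes, 2)),
        # Each object's largest dimension falls inside its allowed range.
        "scale_plausible": all(
            size_limits[b.name][0]
            <= max(h - l for l, h in zip(b.lo, b.hi))
            <= size_limits[b.name][1]
            for b in boxes if b.name in size_limits
        ),
    }
    # Full-scene success requires all four criteria at once.
    report["success"] = all(report.values())
    return report


required = {"table", "chair"}
limits = {"table": (0.5, 3.0), "chair": (0.3, 1.5)}
good_scene = [
    Box("table", (0.0, 0.0, 0.0), (1.5, 0.8, 0.75)),
    Box("chair", (2.0, 0.0, 0.0), (2.5, 0.5, 0.9)),
]
floating_scene = [
    Box("table", (0.0, 0.0, 0.0), (1.5, 0.8, 0.75)),
    Box("chair", (2.0, 0.0, 0.2), (2.5, 0.5, 1.1)),
]
print(check_scene(good_scene, required, limits))
print(check_scene(floating_scene, required, limits))
```

Because the criteria are combined with a logical AND, one floating or interpenetrating object is enough to fail the whole scene, which is one reason a full-scene success rate can stay at zero even when most individual objects are placed correctly.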
