TL;DR
This paper introduces RLCSF, a framework that uses compiler and language server feedback as a reinforcement learning signal for coding agents, together with Lanser-CLI, a CLI-first orchestration layer that exposes that signal to agents and CI.
Contribution
It presents RLCSF, a reinforcement learning framework driven by compiler diagnostics and language server signals, and Lanser-CLI, a command-line tool that makes this supervision deterministic and replayable.
Findings
RLCSF treats tool interactions as transitions with shaped rewards.
Lanser-CLI converts ephemeral sessions into replayable bundles.
The framework guarantees non-negative rewards for componentwise improvements.
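To make the findings above concrete, here is a minimal sketch of a shaped reward over one tool transition. It is not the paper's actual formula: the `ToolState` fields, the weights, and the linear form are all assumptions chosen for illustration, using the three components the abstract names (diagnostics, selector confidence, edit safety). With non-negative weights, a transition that does not worsen any component cannot receive a negative reward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolState:
    """Hypothetical snapshot of workspace facts before or after a tool call."""
    error_count: int            # number of compiler / type-checker errors
    selector_confidence: float  # in [0, 1]; how surely the target symbol resolved
    edit_safe: bool             # whether the edit passed its safety checks


# Illustrative non-negative weights (assumed, not taken from the paper).
W_DIAG, W_SELECTOR, W_SAFETY = 1.0, 0.5, 0.5


def shaped_reward(before: ToolState, after: ToolState) -> float:
    """Weighted sum of per-component improvements for one transition."""
    diag_gain = before.error_count - after.error_count
    selector_gain = after.selector_confidence - before.selector_confidence
    safety_gain = int(after.edit_safe) - int(before.edit_safe)
    return W_DIAG * diag_gain + W_SELECTOR * selector_gain + W_SAFETY * safety_gain


if __name__ == "__main__":
    s0 = ToolState(error_count=3, selector_confidence=0.5, edit_safe=False)
    s1 = ToolState(error_count=1, selector_confidence=0.75, edit_safe=True)
    print(shaped_reward(s0, s1))  # improvement in every component
    print(shaped_reward(s1, s0))  # regression in every component
```

Because every input is a deterministic fact about the workspace, the same pair of states always yields the same reward, which is the property that makes the supervision replayable.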
Abstract
Coding agents fail when text-level guesses outrun program facts: they hallucinate APIs, drift to the wrong symbol, and apply edits without evidence that the workspace remains valid. Compilers, type checkers, and language servers already compute the missing supervision signal, in the form of diagnostics, symbol resolution, type information, references, and refactoring preconditions, but expose it through interfaces designed for human-driven IDEs rather than learning loops. We introduce Reinforcement Learning from Compiler and Language Server Feedback (RLCSF) together with Lanser-CLI, a CLI-first orchestration layer that exposes this signal to agents and CI. RLCSF treats each tool interaction as a transition and computes a shaped process reward from deterministic changes in diagnostics, selector confidence, and edit safety. Lanser-CLI, in turn, converts ephemeral LSP sessions into…
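The idea of turning an ephemeral language server session into something replayable can be sketched as follows. This is only an illustration of the general record-and-replay technique, not Lanser-CLI's real bundle format: the function names, the bundle layout, and the use of a SHA-256 digest are assumptions. The request shown uses the standard LSP method name `textDocument/definition`; the file URI and positions are made up.

```python
import hashlib
import json


def make_bundle(exchanges):
    """Freeze recorded (request, response) pairs into a canonical bundle.

    Canonical JSON (sorted keys, fixed separators) makes the digest
    independent of dict key order, so equal sessions hash equally.
    """
    payload = json.dumps(exchanges, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"exchanges": json.loads(payload), "sha256": digest}


def replay(bundle, request):
    """Return the recorded response for a request, without a live server."""
    key = json.dumps(request, sort_keys=True)
    for recorded_request, recorded_response in bundle["exchanges"]:
        if json.dumps(recorded_request, sort_keys=True) == key:
            return recorded_response
    raise KeyError("request was not recorded in this bundle")


if __name__ == "__main__":
    request = {
        "method": "textDocument/definition",
        "params": {"uri": "file:///demo.py", "line": 4, "character": 10},
    }
    response = {"uri": "file:///demo.py", "line": 1, "character": 4}
    bundle = make_bundle([(request, response)])
    print(bundle["sha256"][:12], replay(bundle, request))
```

Replaying from a bundle like this lets a training loop or CI job recompute rewards from identical inputs, without depending on server timing or state.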
