ATLAS-RTC: Closing the Loop on LLM Agent Output with Token-Level Runtime Control

Christopher Cruz

arXiv:2603.27905 · cs.LG · April 7, 2026

TL;DR

ATLAS-RTC is a runtime control system for autoregressive language models that enforces structured output during decoding. By monitoring generation at each step and intervening when the output drifts from its contract, it raises first-attempt success rates and cuts latency.

Contribution

The paper introduces a closed-loop runtime control mechanism that detects and corrects decoding drift during generation, in contrast to static constrained decoding and post-hoc validation.

Findings

01. Improves first-attempt success rates by 20 to 37.8 percentage points across structured generation and tool-calling tasks.

02. Reduces latency by up to 88% in failure-dominated settings.

03. Many failures stem from decoding artifacts rather than task misunderstanding.

Abstract

We present ATLAS-RTC, a runtime control system for autoregressive language models that enforces structured output during decoding. ATLAS-RTC monitors generation at each step, detects drift from output contracts using lightweight signals, and applies targeted interventions such as biasing, masking, and rollback. Unlike post-hoc validation or static constrained decoding, it operates in a closed loop, enabling correction before errors materialize. Across structured generation and tool-calling tasks, ATLAS-RTC improves first-attempt success rates by 20 to 37.8 percentage points, with up to 88% latency reduction in failure-dominated settings. Results show that many failures arise from decoding artifacts rather than task misunderstanding, motivating runtime control as a distinct layer in LLM systems.
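
The monitor, detect, intervene loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The contract, the vocabulary, the stand-in model `toy_model`, the drift signal (softmax mass on contract-forbidden tokens), and the thresholds `bias_at` and `mask_at` are all assumptions chosen for illustration.

```python
import math

# Hypothetical output contract: a fixed token sequence (illustrative only).
CONTRACT = ["{", "key", ":", "value", "}", "<eos>"]
VOCAB = ["{", "key", ":", "value", "}", "<eos>", "Sure", ","]


def allowed_next(prefix):
    """Tokens the contract permits after `prefix` (empty set once complete)."""
    return {CONTRACT[len(prefix)]} if len(prefix) < len(CONTRACT) else set()


def toy_model(prefix):
    """Stand-in for an LLM: logits that sometimes favor a contract-breaking token."""
    logits = {tok: 0.0 for tok in VOCAB}
    step = len(prefix)
    if step < len(CONTRACT):
        logits[CONTRACT[step]] = 2.0
    if step == 0:
        logits["Sure"] = 3.0  # chatty preamble instead of the opening brace
    if step == 3:
        logits[","] = 2.5  # stray separator where a value is expected
    return logits


def drift_score(logits, allowed):
    """Assumed drift signal: softmax probability mass on forbidden tokens."""
    z = sum(math.exp(v) for v in logits.values())
    return sum(math.exp(v) for t, v in logits.items() if t not in allowed) / z


def greedy_decode(model, steps):
    """Uncontrolled baseline: always take the highest-logit token."""
    prefix = []
    for _ in range(steps):
        logits = model(prefix)
        prefix.append(max(logits, key=logits.get))
    return prefix


def controlled_decode(model, bias=0.25, bias_at=0.3, mask_at=0.75, max_steps=20):
    """Closed-loop decoding: bias on mild drift, mask on strong drift,
    and roll back one token (then retry with masking) if a violation slips through."""
    prefix, actions = [], []
    force_mask = False
    for _ in range(max_steps):
        allowed = allowed_next(prefix)
        if not allowed:  # contract satisfied
            break
        logits = model(prefix)
        drift = drift_score(logits, allowed)
        if force_mask or drift >= mask_at:
            logits = {t: v for t, v in logits.items() if t in allowed}
            actions.append("mask")
        elif drift >= bias_at:
            logits = {t: (v + bias if t in allowed else v) for t, v in logits.items()}
            actions.append("bias")
        else:
            actions.append("none")
        force_mask = False
        token = max(logits, key=logits.get)
        prefix.append(token)
        if token not in allowed:  # violation detected after emission
            prefix.pop()  # rollback
            actions.append("rollback")
            force_mask = True  # retry this position with masking
    return prefix, actions


baseline = greedy_decode(toy_model, len(CONTRACT))
output, actions = controlled_decode(toy_model)
print("greedy:    ", baseline)
print("controlled:", output)
print("actions:   ", actions)
```

In this toy setup the greedy baseline opens with `Sure` and breaks the contract even though the stand-in model ranks the correct token second, which mirrors the paper's point that failures can be decoding artifacts. The controller escalates from biasing to masking and uses rollback only when a weaker intervention fails; a real system would derive `allowed_next` from a schema or grammar and would likely use richer drift signals.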
