LYNX: Learning Dynamic Exits for Confidence-Controlled Reasoning
Ömer Faruk Akgül, Yusuf Hakan Kalaycı, Rajgopal Kannan, Willie Neiswanger, Viktor Prasanna

TL;DR
LYNX is an online early-exit mechanism for reasoning models that uses confidence cues and a trained probe to reduce inference tokens significantly while maintaining or improving accuracy across diverse tasks.
Contribution
LYNX introduces a novel confidence-controlled early-exit method that leverages hidden states and a single trained probe, applicable across multiple tasks and model sizes without additional inference overhead.
Findings
Reduces tokens by up to 70% while maintaining accuracy.
Improves accuracy on math benchmarks by up to 12 points.
Transfers zero-shot to non-math tasks with token savings.
Abstract
Large reasoning models achieve strong performance on complex tasks by generating extended chains of thought, but they often "overthink": continuing to reason long after they have enough information to answer correctly. This wastes inference-time compute and can hurt accuracy. Existing attempts to stop early either manipulate decoding with extra sampling and heuristics, rely on auxiliary verifier models, or operate only as post-hoc analysis pipelines without formal guarantees. We introduce LYNX, an online early-exit mechanism that turns a model's own hidden-state awareness into confidence-controlled stopping decisions. LYNX attaches exit decisions to naturally occurring reasoning cues (e.g., "hmm", "wait") during generation, trains a lightweight probe on hidden states at those cue tokens using supervision from forced exits, and wraps the resulting scores in split conformal prediction to…
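To make the idea of a confidence-controlled exit concrete, here is a minimal sketch of a generic split conformal threshold applied to probe scores. It is not the paper's procedure: the abstract is cut off before it states what the conformal step provides, so the calibration rule here is an assumption. The function names, `unsafe_scores`, and `alpha` are hypothetical. It assumes a held-out calibration set of probe scores recorded at cue tokens where a forced exit produced a wrong answer. Under exchangeability, a new cue of that kind then exceeds the threshold with probability at most `alpha`.

```python
import math


def conformal_exit_threshold(unsafe_scores, alpha):
    """Hypothetical helper: split conformal threshold on probe scores.

    unsafe_scores: probe scores from calibration cues where a forced exit
                   gave a wrong answer (an assumed calibration setup).
    alpha: tolerated probability of exiting at such a cue.
    """
    n = len(unsafe_scores)
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: never exit early.
        return math.inf
    return sorted(unsafe_scores)[k - 1]


def should_exit(probe_score, threshold):
    # Exit only when the probe score is strictly above the threshold.
    return probe_score > threshold


# Usage with made-up calibration scores (illustrative values only).
calibration = [0.3, 0.1, 0.7, 0.5, 0.2, 0.6, 0.4]
threshold = conformal_exit_threshold(calibration, alpha=0.25)
exit_now = should_exit(0.9, threshold)
```

With seven calibration scores and `alpha=0.25`, the rank is `ceil(8 * 0.75) = 6`, so the threshold is the sixth-smallest score. A smaller `alpha` raises the threshold and makes early exits rarer, and if the calibration set is too small for the requested `alpha`, the sketch falls back to never exiting.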
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Multimodal Machine Learning Applications
