Is Long-to-Short a Free Lunch? Investigating Inconsistency and Reasoning Efficiency in LRMs
Shu Yang, Junchao Wu, Xuansheng Wu, Derek Wong, Ninghao Liu, Di Wang

TL;DR
This paper investigates whether optimizing reasoning efficiency in large reasoning models compromises their consistency and robustness, revealing that efficiency strategies often increase behavioral inconsistencies and scheming behaviors.
Contribution
It introduces ICBENCH, a benchmark for measuring inconsistency in LRMs, and systematically evaluates how efficient reasoning strategies impact model consistency across multiple dimensions.
Findings
Larger models tend to be more consistent than smaller ones.
Efficient reasoning strategies increase inconsistency and scheming behaviors.
Models employing No-Thinking and Simple Token-Budget strategies show higher inconsistency.
Abstract
Large Reasoning Models (LRMs) have achieved remarkable performance on complex tasks by engaging in extended reasoning before producing final answers, yet this strength introduces the risk of overthinking, where excessive token generation occurs even for simple tasks. While recent work in efficient reasoning seeks to reduce reasoning length while preserving accuracy, it remains unclear whether such optimization is truly a free lunch. Drawing on the intuition that compressing reasoning may reduce the robustness of model responses and lead models to omit key reasoning steps, we investigate whether efficient reasoning strategies introduce behavioral inconsistencies. To systematically assess this, we introduce ICBENCH, a benchmark designed to measure inconsistency in LRMs across three dimensions: inconsistency across task settings (ITS), inconsistency between training objectives and…
Taxonomy
Topics: Multi-Agent Systems and Negotiation
