ReflCtrl: Controlling LLM Reflection via Representation Engineering
Ge Yan, Chung-En Sun, Tsui-Wei (Lily) Weng

TL;DR
ReflCtrl introduces a representation engineering approach to control self-reflection in large language models, reducing inference costs by selectively managing reflection steps without sacrificing reasoning performance.
Contribution
The paper presents ReflCtrl, a novel framework that identifies and steers reflection behavior in LLMs through latent space manipulation, enabling cost-effective reasoning.
Findings
Reflection steps are often redundant in strong models
Up to 33.6% of reasoning tokens can be saved by reducing reflection steps
Reflection behavior correlates with internal uncertainty signals
Abstract
Large language models (LLMs) with Chain-of-Thought (CoT) reasoning have achieved strong performance across diverse tasks, including mathematics, coding, and general reasoning. A distinctive ability of these reasoning models is self-reflection: the ability to review and revise previous reasoning steps. While self-reflection enhances reasoning performance, it also increases inference cost. In this work, we study self-reflection through the lens of representation engineering. We segment the model's reasoning into steps, identify the steps corresponding to reflection, and extract a reflection direction in the latent space that governs this behavior. Using this direction, we propose a stepwise steering method that can control reflection frequency. We call our framework ReflCtrl. Our experiments show that (1) in many cases reflections are redundant, especially in stronger models (in our…
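The pipeline outlined in the abstract (segment the reasoning into steps, flag the reflection steps, extract a direction, steer along it) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the cue-word detector in `REFLECTION_CUES`, the difference-of-means estimate of the direction, the additive update with coefficient `alpha`, and the toy 3-dimensional vectors standing in for per-step hidden states. The authors' actual procedure may differ on each of these points.

```python
from typing import List, Sequence

# Hypothetical cue phrases used to flag a reflection step (assumption, not from the paper).
REFLECTION_CUES = ("wait", "let me double-check", "alternatively")


def split_steps(reasoning: str) -> List[str]:
    """Segment a reasoning trace into steps; here a step is a blank-line-separated block."""
    return [s.strip() for s in reasoning.split("\n\n") if s.strip()]


def is_reflection(step: str) -> bool:
    """Flag a step as reflection if it opens with one of the cue phrases."""
    lowered = step.lower()
    return any(lowered.startswith(cue) for cue in REFLECTION_CUES)


def mean(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def reflection_direction(reflect_acts, other_acts) -> List[float]:
    """Difference of means: mean(reflection activations) - mean(other activations)."""
    return [r - o for r, o in zip(mean(reflect_acts), mean(other_acts))]


def steer(hidden: Sequence[float], direction: Sequence[float], alpha: float) -> List[float]:
    """Shift a hidden state along the direction; alpha < 0 pushes away from reflection."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy trace with three steps, the second of which is a reflection.
trace = "Compute 12 * 3 = 36.\n\nWait, let me re-check the product.\n\nSo the answer is 36."
steps = split_steps(trace)
flags = [is_reflection(s) for s in steps]

# Toy stand-ins for one hidden-state vector per step (made-up numbers).
acts = [[0.0, 1.0, 0.0], [2.0, 1.0, 0.0], [0.0, 1.0, 2.0]]
reflect_acts = [a for a, f in zip(acts, flags) if f]
other_acts = [a for a, f in zip(acts, flags) if not f]

direction = reflection_direction(reflect_acts, other_acts)
steered = steer([1.0, 1.0, 1.0], direction, alpha=-0.5)
```

In this sketch a negative `alpha` moves the state away from the estimated reflection direction and a positive one moves it toward that direction, which is one simple way to read "controlling reflection frequency". In a real model the vectors would come from a chosen layer's activations at step boundaries rather than from hand-written lists.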
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
