KV Cache Steering for Controlling Frozen LLMs

Max Belitsky; Dawid J. Kopiczko; Michael Dorkenwald; M. Jehanzeb Mirza; James R. Glass; Cees G. M. Snoek; Yuki M. Asano

arXiv:2507.08799·cs.CL·September 29, 2025

3 Reviews

TL;DR

This paper introduces cache steering, a lightweight, one-shot method to implicitly guide frozen language models towards better reasoning and behavior control by manipulating their key-value caches during inference.

Contribution

It presents a novel cache steering technique that induces reasoning in frozen models without fine-tuning, using reasoning traces to construct steering vectors for improved performance.

Findings

1. Enhances reasoning quality and task performance across benchmarks.

2. Scales effectively to larger models and challenging datasets.

3. Enables controllable transfer of reasoning styles.

Abstract

We propose cache steering, a lightweight method for implicit steering of language models via a one-shot intervention applied directly to the key-value cache. To validate its effectiveness, we apply cache steering to induce chain-of-thought reasoning in small language models. Our approach constructs steering vectors from reasoning traces, obtained either from teacher models (e.g., GPT-4o) or existing human annotations, that shift model behavior toward more explicit, multi-step reasoning without fine-tuning or prompt modifications. Experimental evaluations on diverse reasoning benchmarks demonstrate that cache steering improves both the qualitative structure of model reasoning and quantitative task performance. Additional experiments show that the method also scales to larger models and yields further gains on challenging datasets such as GPQA and MATH. Compared to prior activation…
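The idea in the abstract can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes the steering vector is the mean difference between cached vectors from prompts that include a reasoning trace ("positive") and prompts that do not ("negative"), and that the one-shot intervention adds a scaled copy of that vector to a single cache position. The abstract does not spell out either detail. The names `steering_vector`, `steer_cache`, and `strength` are hypothetical, and the cache is reduced to one layer and one head, stored as plain lists.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(positive: Sequence[Vector], negative: Sequence[Vector]) -> Vector:
    """Assumed construction: mean(positive) - mean(negative)."""
    mean_pos = mean_vector(positive)
    mean_neg = mean_vector(negative)
    return [p - q for p, q in zip(mean_pos, mean_neg)]


def steer_cache(cache: Sequence[Vector], steer: Vector,
                strength: float, position: int = -1) -> List[Vector]:
    """Return a copy of the cache with one position shifted by strength * steer.

    The edit happens once, before decoding; later decode steps read the
    modified cache without any further intervention.
    """
    steered = [list(vec) for vec in cache]
    steered[position] = [x + strength * s
                         for x, s in zip(steered[position], steer)]
    return steered


# Toy key vectors extracted from two contrastive pairs (hypothetical numbers).
pos_keys = [[1.0, 2.0], [3.0, 4.0]]
neg_keys = [[0.0, 1.0], [1.0, 1.0]]
key_steer = steering_vector(pos_keys, neg_keys)

# Toy key cache for a three-token prompt; only the last position is shifted.
key_cache = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
steered_keys = steer_cache(key_cache, key_steer, strength=0.5)
```

The same two functions would be applied to the value cache with its own strength. In a real model the vectors would be per layer and per head, and which position to modify is a design choice that this sketch fixes to the last prompt token purely for illustration.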

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 4

Strengths

1. The proposed cache steering is simple and easy to implement compared with other steering approaches. Because it has little impact on latency, it shows better potential for real applications.
2. The experiments are comprehensive, covering a wide range of LLMs and various reasoning datasets.

Weaknesses

1. Limited performance: In Table 1, the proposed approach improves average accuracy by less than 1% over the CoT prompt. Given that its performance varies by more than 1% across hyperparameter settings in Figure 2, I am not convinced that the proposed approach improves reasoning beyond the commonly used zero-shot CoT prompt.
2. Lack of interpretability analysis: No theoretical or empirical analysis for explaining why and how the proposed approach works, leav

Reviewer 02 · Rating 4 · Confidence 5

Strengths

1. Beyond simple reasoning induction, KV cache steering enables controllable transfer of reasoning styles without requiring any computation-heavy retraining of the model.
2. KV cache steering is a one-time intervention, whereas activation steering must be applied repeatedly (at every decoding step), and it requires less hyperparameter tuning.

Weaknesses

1. While this work uses steering vectors to induce reasoning in a model, other steering methods exist that are applied to reduce reasoning (for example, [1, 2]). It is therefore important to highlight these works and clearly state how the steering vectors are identified differently, so that the community can understand the usefulness of and differences between such steering methods. Please add discussion to compare your work with these steerable calibration methods, particularly highlight the

Reviewer 03 · Rating 4 · Confidence 5

Strengths

- The proposed cache steering is computationally more efficient than activation steering by removing the need for continuous activation editing.
- The paper provides comprehensive evaluations across multiple models and datasets, with well-designed ablations on contrastive pair size, steering strength, and reasoning style.
- The writing is clear, and the figures and tables are well-organised, improving readability and accessibility.

Weaknesses

If I have misunderstood any of the following points or if additional evidence can be provided, I would happily reconsider my evaluation.
- The central idea of this work is to apply steering to the KV cache rather than to the hidden activations. However, as noted by the authors themselves in the related work section (Liu et al., 2025b), prior studies have already explored KV-cache-based interventions for enhancing reasoning ability. It would therefore be helpful if the authors could clarify what
