Steering Information Utility in Key-Value Memory for Language Model Post-Training
Chunyuan Deng, Ruidi Chang, Hanjie Chen

TL;DR
This paper introduces InfoSteer, a lightweight post-training method that guides language models to better utilize their stored knowledge, improving performance and interpretability across various models and tasks.
Contribution
The paper proposes InfoSteer, an approach that treats FFN layers as key-value memory and encourages the model to use the stored memory vectors during post-training, improving both performance and interpretability.
Findings
Consistent performance improvements across the Qwen, Gemma, and Llama model families on 15 downstream tasks.
Steered models allocate information more efficiently, focusing on meaningful tokens.
Enhanced ability to adapt to in- and out-of-distribution data.
Abstract
Recent advancements in language models (LMs) have marked a shift toward the growing importance of post-training. Yet post-training approaches such as supervised fine-tuning (SFT) do not guarantee the effective use of knowledge acquired during pretraining. We therefore introduce InfoSteer, a lightweight method that encourages parametric information utilization in LMs during post-training. Specifically, InfoSteer treats the feed-forward network (FFN) layer as an associative key-value memory and promotes the use of stored memory vectors via forward-pass interventions or regularization during backpropagation. This simple guidance during the post-training phase yields consistent performance improvements across diverse model families (including Qwen, Gemma, and Llama), spanning 15 downstream tasks in both in-distribution (ID) and out-of-distribution (OOD) evaluations. Beyond performance gains, we…
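The key-value view of the FFN described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `ffn_key_value`, `steer_coefficients`, and `negative_entropy` are hypothetical, the ReLU activation is an assumption, and both the top-k boost (standing in for a forward-pass intervention) and the entropy term (standing in for a regularizer) are illustrative choices that the summary does not confirm.

```python
import math

def ffn_key_value(x, keys, values):
    """FFN as key-value memory: out = sum_i relu(x . k_i) * v_i (ReLU assumed)."""
    # One coefficient per memory slot: how strongly the input matches each key.
    coeffs = [max(0.0, sum(xi * ki for xi, ki in zip(x, k))) for k in keys]
    dim = len(values[0])
    # The output is the coefficient-weighted sum of the value (memory) vectors.
    out = [sum(c * v[j] for c, v in zip(coeffs, values)) for j in range(dim)]
    return out, coeffs

def steer_coefficients(coeffs, scale=1.5, top_k=1):
    """Hypothetical forward-pass intervention: amplify the top_k coefficients."""
    ranked = sorted(range(len(coeffs)), key=lambda i: coeffs[i], reverse=True)
    boosted = set(ranked[:top_k])
    return [c * scale if i in boosted else c for i, c in enumerate(coeffs)]

def negative_entropy(coeffs, eps=1e-12):
    """Hypothetical regularizer: negative entropy of normalized coefficients.

    Lower values mean the activation mass is spread over more memory vectors.
    """
    total = sum(coeffs) + eps
    probs = [c / total for c in coeffs]
    return sum(p * math.log(p + eps) for p in probs)

x = [1.0, 2.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

out, coeffs = ffn_key_value(x, keys, values)
steered = steer_coefficients(coeffs, scale=1.5, top_k=1)
```

In this toy setup the third key does not match the input, so its memory vector contributes nothing to the output. A term like `negative_entropy(coeffs)` could in principle be added to a training loss to influence how many memory vectors are used; the form and sign the paper actually uses are not stated in this summary.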
Taxonomy
Topics: Natural Language Processing Techniques
