Generalization or Memorization: Dynamic Decoding for Mode Steering
Xuanming Zhang

TL;DR
This paper introduces a theoretical framework and a novel inference-time algorithm called Dynamic Mode Steering (DMS) to distinguish and control whether large language models generalize or memorize, improving their reliability in critical tasks.
Contribution
The paper formalizes the distinction between generalization and memorization using the Information Bottleneck principle and develops DMS to steer models towards generalization during inference.
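For orientation, the Information Bottleneck principle is usually stated as the objective below. This is the standard textbook form, not necessarily the exact formulation used in the paper. The symbols are assumptions introduced here for illustration: X is the input, Y the task target, Z the learned representation, and β a trade-off weight.

```latex
\min_{p(z \mid x)} \; \mathcal{L}_{\mathrm{IB}} \;=\; I(X; Z) \;-\; \beta \, I(Z; Y)
```

Read this way, a small I(X; Z) with a large I(Z; Y) corresponds to a compressed, task-relevant representation (generalization), while a representation that keeps I(X; Z) large retains input detail it did not need to keep (a failure to compress, i.e. memorization).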
Findings
DMS improves logical consistency in reasoning tasks.
DMS enhances factual accuracy in faithfulness tasks.
The theoretical model links compression to generalization and a failure to compress to memorization.
Abstract
Large Language Models (LLMs) exhibit a troubling duality, capable of both remarkable generalization and brittle, verbatim memorization of their training data. This unpredictability undermines their reliability in high-stakes applications. In this work, we propose a unified framework to understand, identify, and control these distinct reasoning modes. First, we introduce a theoretical model based on the Information Bottleneck (IB) principle, formalizing generalization as the learning of a compressed, task-relevant representation and memorization as a failure to compress. Building on this theory, we develop Dynamic Mode Steering (DMS), a novel inference-time algorithm which comprises two components: (1) a lightweight, causally-grounded linear probe that identifies the model's instantaneous reliance on memorization, and (2) a dynamic activation steering mechanism that nudges the model's…
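The two components described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract is truncated before it specifies the steering target, so the steering direction, the threshold, the scaling rule, and every name below (`probe_memorization`, `steer`, `steer_vec`, `alpha`, `threshold`) are hypothetical choices made only for illustration. The sketch assumes a logistic linear probe over a single hidden-state vector and a steering step whose strength grows with the probe score.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def probe_memorization(h, w, b):
    """Linear probe: score in (0, 1) for how strongly hidden state h
    looks like memorization (w and b are assumed to be pre-trained)."""
    return float(sigmoid(h @ w + b))


def steer(h, w, b, steer_vec, alpha=1.0, threshold=0.5):
    """Shift h along steer_vec, scaled by how far the probe score
    exceeds the threshold. Below the threshold, h is left unchanged.
    Returns the (possibly shifted) state and the probe score."""
    p = probe_memorization(h, w, b)
    strength = alpha * max(0.0, p - threshold)
    return h + strength * steer_vec, p


# Toy data: random probe weights stand in for a trained probe.
rng = np.random.default_rng(0)
d = 8
w = rng.normal(size=d)
b = 0.0
unit_w = w / np.linalg.norm(w)

# Assumption: steer directly against the probe direction.
steer_vec = -unit_w

# A hidden state with a high probe score gets nudged.
h_high = 2.0 * unit_w
h_high_steered, p_before = steer(h_high, w, b, steer_vec, alpha=4.0)
p_after = probe_memorization(h_high_steered, w, b)

# A hidden state with a low probe score is left alone.
h_low = -2.0 * unit_w
h_low_steered, p_low = steer(h_low, w, b, steer_vec, alpha=4.0)
```

In a real model the probe would be trained on labeled activations and the shift would be applied to a chosen layer at each decoding step; both of those details are outside what the abstract states.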
