GSS: Gated Subspace Steering for Selective Memorization Mitigation in LLMs
Xuanqi Zhang, Haoyang Shang, Xiaoxiao Li

TL;DR
This paper introduces Gated Subspace Steering (GSS), a context-aware method that selectively mitigates memorization in large language models, improving privacy and generalization at lower computational cost than existing methods.
Contribution
We propose GSS, a novel, efficient, and effective method for targeted memorization mitigation in LLMs based on a probe-and-steer framework and optimal subspace steering.
Findings
GSS matches or exceeds state-of-the-art memorization reduction.
GSS requires 100-1000x less compute than existing methods.
GSS provides new theoretical insights into the geometry of neural memorization.
Abstract
Large language models (LLMs) can memorize and reproduce training sequences verbatim -- a tendency that undermines both generalization and privacy. Existing mitigation methods apply interventions uniformly, degrading performance on the majority of tokens that generalize normally. We show empirically that memorization is sparse, intermittent, and token-conditioned, suggesting that effective mitigation requires context-aware intervention rather than static parameter modification. To this end, we propose a novel and effective selective memorization mitigation method -- Gated Subspace Steering (GSS), which decomposes intervention into a probe (detecting memorization-relevant activations) and a steer (applying targeted correction only when the probe exceeds a threshold). The optimal probe-steer pair emerges from a principled optimization framework based on optimal subspace steering.…
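The probe-and-steer gating described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linear probe score, the `strength` parameter, and the choice to scale the correction by how far the score exceeds the threshold are all assumptions made for illustration. The sketch only shows the general idea that activations are left untouched unless a probe score exceeds a threshold.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gated_steer(
    hidden: Sequence[float],
    probe: Sequence[float],
    steer: Sequence[float],
    threshold: float,
    strength: float = 1.0,
) -> List[float]:
    """Hypothetical gated steering step (illustrative only).

    probe:     direction used to score the activation (assumed linear probe)
    steer:     direction subtracted when the gate fires (assumed)
    threshold: gate value; at or below it the activation is returned unchanged
    strength:  assumed scaling factor for the correction
    """
    score = dot(probe, hidden)
    if score <= threshold:
        # Gate closed: activations below the threshold pass through untouched.
        return list(hidden)
    # Gate open: correct in proportion to how far the score exceeds the threshold.
    scale = strength * (score - threshold)
    return [h - scale * s for h, s in zip(hidden, steer)]


# Toy example with made-up 3-dimensional activations.
probe = [1.0, 0.0, 0.0]
steer = [1.0, 0.0, 0.0]
benign = [0.2, 0.5, -0.3]   # probe score 0.2, below the threshold
flagged = [2.0, 0.5, -0.3]  # probe score 2.0, above the threshold

out_benign = gated_steer(benign, probe, steer, threshold=1.0)
out_flagged = gated_steer(flagged, probe, steer, threshold=1.0)
```

In this toy setup the benign activation is returned unchanged, while the flagged activation is pulled back along the steer direction until its probe score equals the threshold. How the paper actually derives the probe-steer pair (its "optimal subspace steering" framework) is not covered by this sketch.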
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Machine Learning in Healthcare
