Compressed Sensing for Capability Localization in Large Language Models
Anna Bair, Yixuan Even Xu, Mingjie Sun, J. Zico Kolter

TL;DR
This paper demonstrates that specific capabilities of large language models are localized to small, sparse subsets of attention heads, and introduces a compressed sensing method to identify these heads efficiently, revealing a modular organization.
Contribution
The paper presents a novel compressed sensing approach to identify task-specific attention heads, showing capabilities are localized and modular within Transformer models.
Findings
Zeroing out as few as five task-specific heads can reduce performance on the targeted task by up to 65%
Capabilities are localized to small, sparse head subsets
Localization is consistent across Llama and Qwen models ranging from 1B to 8B parameters
Abstract
Large language models (LLMs) exhibit a wide range of capabilities, including mathematical reasoning, code generation, and linguistic behaviors. We show that many capabilities are highly localized to small subsets of attention heads within Transformer architectures. Zeroing out as few as five task-specific heads can degrade performance by up to 65% on standard benchmarks measuring the capability of interest, while largely preserving performance on unrelated tasks. We introduce a compressed-sensing-based method that exploits the sparsity of these heads to identify them via strategic knockouts and a small number of model evaluations. We validate these findings across Llama and Qwen models ranging from 1B to 8B parameters and a diverse set of capabilities including mathematical abilities and code generation, revealing a modular organization in which specialized capabilities are…
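The idea of recovering a sparse set of heads from a small number of knockout evaluations can be illustrated with a toy simulation. The sketch below is not the authors' implementation; it assumes, purely for illustration, that the performance drop from knocking out a set of heads is approximately the sum of per-head effects, that knockout sets are drawn at random, and that recovery uses a standard lasso solver (ISTA). The function names, the head count of 256, the 128 evaluations, and the noise level are all hypothetical choices.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of the L1 norm: shrink each entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def recover_head_effects(masks, drops, lam=0.1, n_iter=3000):
    """Estimate sparse per-head effects with ISTA (lasso).

    masks: (n_evals, n_heads) array, 1.0 where a head was knocked out.
    drops: (n_evals,) observed performance drop for each knockout pattern.
    Assumes drops are roughly linear in the knocked-out heads.
    """
    # Center columns and targets so the constant offset drops out.
    A = masks - masks.mean(axis=0, keepdims=True)
    y = drops - drops.mean()
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w


def simulate(n_heads=256, k=5, n_evals=128, noise=0.005, seed=0):
    """Synthetic stand-in for model evaluations (hypothetical numbers)."""
    rng = np.random.default_rng(seed)
    true_heads = rng.choice(n_heads, size=k, replace=False)
    effects = np.zeros(n_heads)
    effects[true_heads] = rng.uniform(0.1, 0.2, size=k)
    # Each evaluation knocks out a random half of the heads.
    masks = rng.integers(0, 2, size=(n_evals, n_heads)).astype(float)
    drops = masks @ effects + noise * rng.normal(size=n_evals)
    return masks, drops, true_heads


if __name__ == "__main__":
    masks, drops, true_heads = simulate()
    w_hat = recover_head_effects(masks, drops)
    top = np.argsort(-np.abs(w_hat))[:5]
    print("recovered:", sorted(top.tolist()))
    print("true:     ", sorted(true_heads.tolist()))
```

In this toy setting, 128 evaluations are used to search over 256 heads, fewer than one evaluation per head, which is only possible because the effect vector is assumed sparse. In a real model, each row of `masks` would correspond to one benchmark run with the indicated heads zeroed out, and the linearity assumption would need to be checked empirically.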
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
