$K$-MSHC: Unmasking Minimally Sufficient Head Circuits in Large Language Models with Experiments on Syntactic Classification Tasks
Pratim Chowdhary, Peter Chin, Deepernab Chakrabarty

TL;DR
This paper introduces a method to identify minimal attention head circuits in large language models for specific tasks, revealing task-specific and partially overlapping neural components in syntactic and arithmetic tasks.
Contribution
The paper proposes the $(K, \epsilon)$-Minimum Sufficient Head Circuit ($K$-MSHC) methodology and the Search-K-MSHC algorithm to uncover minimal task-critical attention heads in large language models.
Findings
Distinct task-specific head circuits identified
Early layers dominate grammar tasks, while word problems draw on both shallow and deep layers
Shared weak heads exist between grammar and arithmetic, but strong heads are task-specific
Abstract
Understanding which neural components drive specific capabilities in mid-sized language models (10B parameters) remains a key challenge. We introduce the $(K, \epsilon)$-Minimum Sufficient Head Circuit ($K$-MSHC), a methodology to identify minimal sets of attention heads crucial for classification tasks, as well as Search-K-MSHC, an efficient algorithm for discovering these circuits. Applying our Search-K-MSHC algorithm to Gemma-9B, we analyze three syntactic task families: grammar acceptability, arithmetic verification, and arithmetic word problems. Our findings reveal distinct task-specific head circuits, with grammar tasks predominantly utilizing early layers, word problems showing pronounced activity in both shallow and deep regions, and arithmetic verification demonstrating a more distributed pattern across the network. We discover non-linear circuit overlap patterns,…
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Topic Modeling · Language Development and Disorders
Methods: Softmax · Attention Is All You Need
