Inferring Functionality of Attention Heads from their Parameters
Amit Elhelo, Mor Geva

TL;DR
This paper introduces MAPS, a framework that infers the functions of attention heads in large language models directly from their parameters, enabling comprehensive analysis without additional training or inference.
Contribution
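MAPS is an efficient framework that maps attention head functionality directly from model parameters. It answers two kinds of questions: how strongly each head in a model implements a predefined operation, and what the salient functionality of a given head is.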
Findings
MAPS's estimations correlate with the heads' outputs during inference, evaluated on 20 operations across 6 LLMs.
It uncovers previously overlooked attention head operations.
The framework offers plausible operation descriptions for most heads.
Abstract
Attention heads are one of the building blocks of large language models (LLMs). Prior work on investigating their operation mostly focused on analyzing their behavior during inference for specific circuits or tasks. In this work, we seek a comprehensive mapping of the operations they implement in a model. We propose MAPS (Mapping Attention head ParameterS), an efficient framework that infers the functionality of attention heads from their parameters, without any model training or inference. We showcase the utility of MAPS for answering two types of questions: (a) given a predefined operation, mapping how strongly heads across the model implement it, and (b) given an attention head, inferring its salient functionality. Evaluating MAPS on 20 operations across 6 popular LLMs shows its estimations correlate with the head's outputs during inference and are causally linked to the model's…
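The abstract does not spell out how a head's parameters are turned into a functionality estimate. The following is only a minimal sketch of one plausible reading, assuming a head's combined value-output matrix is projected into vocabulary space with the model's embedding and unembedding matrices, and that an operation is given as (source token, target token) pairs. The names `head_vocab_map`, `relation_score`, `salient_mappings`, and the toy matrices are all hypothetical and not taken from the paper.

```python
import numpy as np

def head_vocab_map(E, W_VO, U):
    """Project a head's value-output matrix into vocabulary space.

    E: (vocab, d) embedding, W_VO: (d, d) head matrix, U: (d, vocab) unembedding.
    Entry [s, t] is read as how strongly source token s promotes target token t.
    """
    return E @ W_VO @ U

def relation_score(M, pairs, k=1):
    """Question (a): fraction of (source, target) pairs whose target
    is among the top-k entries of the source token's row."""
    hits = 0
    for s, t in pairs:
        topk = np.argsort(-M[s])[:k]
        hits += int(t in topk)
    return hits / len(pairs)

def salient_mappings(M, n=3):
    """Question (b): the n (source, target) index pairs with the largest entries."""
    flat = np.argsort(-M, axis=None)[:n]
    return [tuple(int(i) for i in np.unravel_index(f, M.shape)) for f in flat]

# Toy setup (assumption): 4 tokens, identity embeddings, and a head that
# maps token i to token (i + 1) % 4 with decreasing strength.
vocab = 4
E = np.eye(vocab)
U = np.eye(vocab)
W_VO = np.zeros((vocab, vocab))
for i, w in enumerate([4.0, 3.0, 2.0, 1.0]):
    W_VO[i, (i + 1) % vocab] = w

M = head_vocab_map(E, W_VO, U)
next_token_pairs = [(i, (i + 1) % vocab) for i in range(vocab)]
score = relation_score(M, next_token_pairs, k=1)
top = salient_mappings(M, n=2)
```

In this toy case the head implements the "next token index" operation for every pair, so `score` is 1.0, and the two strongest mappings are the pairs (0, 1) and (1, 2). With real model weights the vocabulary-by-vocabulary matrix is large, so one would likely restrict rows to the source tokens of interest rather than materialize it in full.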
