Inferring Functionality of Attention Heads from their Parameters

Amit Elhelo; Mor Geva

arXiv:2412.11965 · cs.CL · June 3, 2025

TL;DR

This paper introduces MAPS, a framework that infers the functions of attention heads in large language models directly from their parameters, enabling comprehensive analysis without additional training or inference.

Contribution

MAPS provides a novel, efficient method to map attention head functions from parameters, revealing insights into model operation and overlooked functionalities.

Findings

1. MAPS correlates well with actual head outputs during inference.
2. It uncovers previously overlooked attention head operations.
3. The framework offers plausible operation descriptions for most heads.

Abstract

Attention heads are one of the building blocks of large language models (LLMs). Prior work on investigating their operation mostly focused on analyzing their behavior during inference for specific circuits or tasks. In this work, we seek a comprehensive mapping of the operations they implement in a model. We propose MAPS (Mapping Attention head ParameterS), an efficient framework that infers the functionality of attention heads from their parameters, without any model training or inference. We showcase the utility of MAPS for answering two types of questions: (a) given a predefined operation, mapping how strongly heads across the model implement it, and (b) given an attention head, inferring its salient functionality. Evaluating MAPS on 20 operations across 6 popular LLMs shows its estimations correlate with the head's outputs during inference and are causally linked to the model's…
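As a rough illustration of what inferring a head's behavior from parameters alone could look like, the sketch below projects a toy head's value-output weights into vocabulary space and scores how often a chosen token-to-token relation appears among the top entries. Everything here is an assumption made for illustration and is not taken from the abstract: the names `head_vocab_map` and `relation_score`, the matrices `E`, `W_VO`, and `U`, the top-k scoring rule, and the toy "successor" relation are all hypothetical, and the actual MAPS procedure may differ.

```python
import numpy as np


def head_vocab_map(E, W_VO, U):
    """Project a head's value-output weights into vocabulary space.

    E    : (V, d) token embedding matrix (hypothetical toy input)
    W_VO : (d, d) combined value-output matrix of one head
    U    : (d, V) unembedding matrix
    Returns a (V, V) matrix whose row s scores every target token for source s.
    """
    return E @ W_VO @ U


def relation_score(M, pairs, k=1):
    """Fraction of (source, target) token-id pairs whose target is among
    the k highest-scoring entries of the source token's row in M."""
    hits = 0
    for s, t in pairs:
        top_k = np.argsort(M[s])[::-1][:k]  # indices of the k largest scores
        hits += int(t in top_k)
    return hits / len(pairs)


# Toy setup (not real model weights): a 5-token vocabulary with identity
# embeddings, and a head that maps each token to its successor by construction.
V = 5
E = np.eye(V)
U = np.eye(V)
W_VO = np.roll(np.eye(V), 1, axis=1)  # row i has its single 1 in column (i + 1) % V

M = head_vocab_map(E, W_VO, U)

successor_pairs = [(i, (i + 1) % V) for i in range(V)]
identity_pairs = [(i, i) for i in range(V)]

print("successor relation:", relation_score(M, successor_pairs, k=1))
print("identity relation: ", relation_score(M, identity_pairs, k=1))
```

In this toy setup the head implements the successor relation by construction, so that relation scores 1.0 and the identity relation scores 0.0. With real model weights the scores would typically fall somewhere in between, and no forward pass through the model is needed to compute them.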

Code & Models

Repositories

amitelhelo/maps (PyTorch, official)

Videos

Inferring Functionality of Attention Heads from their Parameters

Taxonomy

Topics: EEG and Brain-Computer Interfaces · Functional Brain Connectivity Studies

Methods: Softmax · Attention Is All You Need