Evaluating Contrast Localizer for Identifying Causal Units in Social & Mathematical Tasks in Language Models

Yassine Jamaa; Badr AlKhamissi; Satrajit Ghosh; Martin Schrimpf

arXiv:2508.08276·cs.CL·August 26, 2025

TL;DR

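This paper adapts a neuroscientific contrast localizer to identify causally relevant units in large language models and vision-language models for Theory of Mind (ToM) and mathematical reasoning tasks. Ablation results show that low-activation units, and units from the mismatched localizer, sometimes hurt performance more than the units the localizer selected.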

Contribution

It adapts a contrast localizer from neuroscience to pinpoint causally relevant units in LLMs and VLMs, and critically evaluates its effectiveness through targeted ablations across 11 LLMs and 5 VLMs (3B to 90B parameters) on ToM and mathematical benchmarks.

Findings

1. Low-activation units sometimes cause larger performance drops than highly activated units.

2. Units from the mathematical localizer often impair ToM performance more than units from the ToM localizer.

3. Contrast localizers may not reliably identify causally relevant units.

Abstract

This work adapts a neuroscientific contrast localizer to pinpoint causally relevant units for Theory of Mind (ToM) and mathematical reasoning tasks in large language models (LLMs) and vision-language models (VLMs). Across 11 LLMs and 5 VLMs ranging in size from 3B to 90B parameters, we localize top-activated units using contrastive stimulus sets and assess their causal role via targeted ablations. We compare the effect of lesioning functionally selected units against low-activation and randomly selected units on downstream accuracy across established ToM and mathematical benchmarks. Contrary to expectations, low-activation units sometimes produced larger performance drops than the highly activated ones, and units derived from the mathematical localizer often impaired ToM performance more than those from the ToM localizer. These findings call into question the causal relevance of…
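The localize-then-ablate procedure described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' code. It assumes a Welch t-statistic as the contrast score, treats "low-activation" units as those with the lowest contrast score, and models ablation as zeroing a unit's activation. The names `contrast_scores`, `select_units`, and `ablate` are hypothetical, and the activations are synthetic toy data rather than real model activations.

```python
import random
from statistics import mean, variance

def contrast_scores(target_acts, control_acts):
    """Per-unit Welch t-statistic (an assumed choice of contrast score).

    Each argument is a list of activation vectors: rows are stimuli,
    columns are units.
    """
    scores = []
    for t_col, c_col in zip(zip(*target_acts), zip(*control_acts)):
        se = (variance(t_col) / len(t_col) + variance(c_col) / len(c_col)) ** 0.5
        scores.append((mean(t_col) - mean(c_col)) / (se + 1e-12))
    return scores

def select_units(scores, k, rng):
    """Return the three unit groups compared in the ablation study."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    return {
        "top": order[-k:],    # most selective for the target condition
        "low": order[:k],     # lowest contrast score (assumed definition)
        "random": rng.sample(range(len(scores)), k),
    }

def ablate(acts, unit_idx):
    """Lesion units by zeroing their activations (assumed ablation scheme)."""
    dead = set(unit_idx)
    return [[0.0 if j in dead else a for j, a in enumerate(row)] for row in acts]

# Toy data: 64 units, 40 stimuli per condition; units 0-3 respond
# strongly to the target condition only.
rng = random.Random(0)
n_units, n_stim = 64, 40
control = [[rng.gauss(0, 1) for _ in range(n_units)] for _ in range(n_stim)]
target = [[rng.gauss(0, 1) + (3.0 if j < 4 else 0.0) for j in range(n_units)]
          for _ in range(n_stim)]

scores = contrast_scores(target, control)
groups = select_units(scores, k=4, rng=rng)
lesioned = ablate(target, groups["top"])
```

In a real evaluation, each of the three groups would be ablated inside the model and downstream benchmark accuracy compared across them. The paper's finding is that the "top" group does not always produce the largest drop.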
