Jacobian Scopes: token-level causal attributions in LLMs

Toni J.B. Liu; Baran Zadeoğlu; Nicolas Boullé; Raphaël Sarfati; Christopher J. Earls

arXiv:2601.16407 · cs.CL · March 17, 2026

Open Access

TL;DR

Jacobian Scopes introduces gradient-based, token-level causal attribution methods for interpreting large language models, revealing how individual input tokens influence predictions and uncovering biases and mechanisms across various NLP tasks.

Contribution

The paper presents Jacobian Scopes, a novel suite of gradient-based methods for token-level causal attribution in LLMs, grounded in perturbation theory and information geometry.

Findings

01

Reveals implicit political biases in LLM predictions.

02

Uncovers word- and phrase-level translation strategies.

03

Provides insights into mechanisms of in-context learning.

Abstract

Large language models (LLMs) make next-token predictions based on clues present in their context, such as semantic descriptions and in-context examples. Yet, elucidating which prior tokens most strongly influence a given prediction remains challenging due to the proliferation of layers and attention heads in modern architectures. We propose Jacobian Scopes, a suite of gradient-based, token-level causal attribution methods for interpreting LLM predictions. Grounded in perturbation theory and information geometry, Jacobian Scopes quantify how input tokens influence various aspects of a model's prediction, such as specific logits, the full predictive distribution, and model uncertainty (effective temperature). Through case studies spanning instruction understanding, translation, and in-context learning (ICL), we demonstrate how Jacobian Scopes reveal implicit political biases, uncover…
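To make the idea of a gradient-based, token-level attribution concrete, here is a minimal sketch on a toy model. It is not the paper's method: the model (`target_logit`), the position weights, and the readout vector are hypothetical stand-ins chosen so the gradient can be written by hand. It only illustrates the general pattern of scoring each input token by the norm of the gradient of one target logit with respect to that token's embedding.

```python
import math

def target_logit(embeddings, weights, readout):
    """Hypothetical toy model: logit = readout . sum_t weights[t] * tanh(embeddings[t])."""
    dim = len(readout)
    hidden = [0.0] * dim
    for a_t, x_t in zip(weights, embeddings):
        for i in range(dim):
            hidden[i] += a_t * math.tanh(x_t[i])
    return sum(r * h for r, h in zip(readout, hidden))

def token_attributions(embeddings, weights, readout):
    """Per-token score: L2 norm of d(logit)/d(embedding_t), using the analytic gradient."""
    scores = []
    for a_t, x_t in zip(weights, embeddings):
        # d/dx tanh(x) = 1 - tanh(x)^2, applied elementwise to this token's embedding.
        grad = [a_t * r * (1.0 - math.tanh(x) ** 2) for r, x in zip(readout, x_t)]
        scores.append(math.sqrt(sum(g * g for g in grad)))
    return scores

# Assumed example: a 3-token context with 2-dimensional embeddings.
embeddings = [[0.0, 0.0], [0.5, -0.5], [2.0, 2.0]]
weights = [0.2, 0.3, 0.5]   # hypothetical fixed mixing weights per position
readout = [1.0, -1.0]       # hypothetical readout row for the target logit
scores = token_attributions(embeddings, weights, readout)
```

In this toy setup the third token has the largest mixing weight but a saturated embedding, so its gradient norm is the smallest of the three. A real LLM would need automatic differentiation through the full network rather than a hand-written gradient.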

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Computational and Text Analysis Methods · Multimodal Machine Learning Applications