Representational Analysis of Binding in Language Models

Qin Dai; Benjamin Heinzerling; Kentaro Inui

arXiv:2409.05448 · cs.CL · October 28, 2024

TL;DR

This paper uncovers how language models internally encode and causally utilize an Ordering ID (OI) to perform entity-attribute binding, advancing understanding of their reasoning mechanisms.

Contribution

It introduces a novel view of the Binding ID mechanism by localizing and proving the causal role of the Ordering ID in language model binding behaviour.

Findings

01. Existence of a low-rank subspace encoding OI in LM activations.

02. Causal effect of OI on entity-attribute binding demonstrated.

03. Manipulating OI representations alters binding outcomes.
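
The three findings can be illustrated with a toy sketch. It is not the authors' procedure: the activations are synthetic, the hidden size, the sample counts, and every name (`true_dir`, `oi_dir`, `decode_order`, `steer`) are assumptions made for illustration, and PCA is used here only as one simple way to find a low-rank direction. A nearest-centroid readout stands in for the binding behaviour of a real LM.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 64      # hypothetical hidden size
n_positions = 3   # ordering of entities in the context: 0, 1, 2
n_per_pos = 200   # synthetic samples per ordering position

# Synthetic "entity activations": isotropic noise plus a shift along one
# hidden direction proportional to the ordering position (an assumption).
true_dir = rng.normal(size=d_model)
true_dir /= np.linalg.norm(true_dir)
order = np.repeat(np.arange(n_positions), n_per_pos)
noise = rng.normal(scale=0.5, size=(order.size, d_model))
acts = noise + 4.0 * order[:, None] * true_dir

# Step 1: recover a rank-1 subspace with PCA (via SVD of centered data).
mean = acts.mean(axis=0)
_, _, vt = np.linalg.svd(acts - mean, full_matrices=False)
oi_dir = vt[0]
proj = (acts - mean) @ oi_dir
if np.corrcoef(proj, order)[0, 1] < 0:  # fix the arbitrary sign of the PC
    oi_dir, proj = -oi_dir, -proj

# Step 2: average projection for each ordering position.
centroids = np.array([proj[order == k].mean() for k in range(n_positions)])

def decode_order(x):
    """Read the ordering position off the subspace (nearest centroid)."""
    p = (x - mean) @ oi_dir
    return int(np.argmin(np.abs(centroids - p)))

def steer(x, src, dst):
    """Move an activation from position `src` to `dst` along the subspace."""
    return x + (centroids[dst] - centroids[src]) * oi_dir

# Step 3: intervene on one activation and read the position again.
x = acts[order == 0][0]
before = decode_order(x)
after = decode_order(steer(x, 0, 2))
print(before, after)
```

In a real experiment the steered vector would be patched back into the model's forward pass and the effect measured on the model's answer, rather than read out by a nearest-centroid rule as in this sketch.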

Abstract

Entity tracking is essential for complex reasoning. To perform in-context entity tracking, language models (LMs) must bind an entity to its attribute (e.g., bind a container to its content) to recall the attribute for a given entity. For example, given a context mentioning "The coffee is in Box Z, the stone is in Box M, the map is in Box H", to infer "Box Z contains the coffee" later, LMs must bind "Box Z" to "coffee". To explain the binding behaviour of LMs, existing research introduces a Binding ID mechanism and states that LMs use an abstract concept called Binding ID (BI) to internally mark entity-attribute pairs. However, it has not captured the Ordering ID (OI) from entity activations that directly determines the binding behaviour. In this work, we provide a novel view of the BI mechanism by localizing OI and proving the causality between OI and binding behaviour.…

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Activation Patching