Representational Analysis of Binding in Language Models
Qin Dai, Benjamin Heinzerling, Kentaro Inui

TL;DR
This paper uncovers how language models internally encode and causally utilize an Ordering ID (OI) to perform entity-attribute binding, advancing understanding of their reasoning mechanisms.
Contribution
It introduces a novel view of the Binding ID mechanism by localizing and proving the causal role of the Ordering ID in language model binding behaviour.
Findings
Existence of a low-rank subspace encoding OI in LM activations.
Causal effect of OI on entity-attribute binding demonstrated.
Manipulating OI representations alters binding outcomes.
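As a rough illustration of what the findings above describe, the toy sketch below builds synthetic activations, estimates a rank-1 direction that tracks an ordering index, and then shifts one activation along that direction to change the decoded index. It is not the authors' method or data: the hidden size, the noise scale, the nearest-mean readout, and the helper names `decode_oi` and `patch_oi` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_per_class = 16, 50  # hypothetical hidden size and samples per ordering index

# Hypothetical ground-truth direction used only to generate the toy data.
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)

# Synthetic "entity activations": small noise plus a component that grows with OI.
oi = np.repeat(np.arange(3), n_per_class)
acts = 0.1 * rng.normal(size=(oi.size, d)) + 3.0 * oi[:, None] * true_dir

# Estimate a rank-1 OI subspace as the top principal direction of the class means.
means = np.stack([acts[oi == k].mean(axis=0) for k in range(3)])
centered = means - means.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
oi_dir = vt[0]
if (means[2] - means[0]) @ oi_dir < 0:  # fix the sign so larger OI means larger projection
    oi_dir = -oi_dir


def decode_oi(h):
    """Read out the OI whose class mean is closest to h along oi_dir."""
    centers = means @ oi_dir
    return int(np.argmin(np.abs(centers - h @ oi_dir)))


def patch_oi(h, target):
    """Shift h along oi_dir only, so its projection matches the target OI's mean."""
    delta = means[target] @ oi_dir - h @ oi_dir
    return h + delta * oi_dir


h0 = acts[0]                     # an activation generated with OI 0
patched = patch_oi(h0, target=2)
before, after = decode_oi(h0), decode_oi(patched)
print(before, after)
```

In this toy setting the edit touches only the component of the activation along `oi_dir`, which mirrors the idea of intervening on a low-rank subspace while leaving the rest of the representation unchanged. Whether a real LM's binding behaviour follows such an edit is the empirical question the paper addresses.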
Abstract
Entity tracking is essential for complex reasoning. To perform in-context entity tracking, language models (LMs) must bind an entity to its attribute (e.g., bind a container to its content) to recall the attribute for a given entity. For example, given a context mentioning "The coffee is in Box Z, the stone is in Box M, the map is in Box H", to infer "Box Z contains the coffee" later, LMs must bind "Box Z" to "coffee". To explain the binding behaviour of LMs, existing research introduces a Binding ID mechanism and states that LMs use an abstract concept called Binding ID (BI) to internally mark entity-attribute pairs. However, existing research has not captured the Ordering ID (OI) from entity activations that directly determines the binding behaviour. In this work, we provide a novel view of the BI mechanism by localizing OI and proving the causality between OI and binding behaviour.…
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Activation Patching
