TL;DR
This paper shows that the linear operators that decode relations in large language models are property-based and highly compressible: rather than encoding distinct relations, they extract recurring semantic properties, which explains their generalization behavior.
Contribution
The paper extends the understanding of relation decoding in transformers from single relations to collections of relations, shows that the decoders have a property-centric structure, and demonstrates that they are highly compressible using tensor networks.
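To make the compressibility claim concrete, here is a minimal sketch of compressing a stack of linear decoders. It assumes the decoders are stacked into a tensor of shape (relations, d, d) and factorized into a small mixing matrix contracted with a shared order-3 core. This summary does not specify the paper's actual tensor network, so the factorization, the synthetic data, and the names `compress_decoders` and `reconstruct` are illustrative assumptions only.

```python
import numpy as np

def compress_decoders(T, rank):
    """Factor a stack of decoders T[r, i, j] ~ sum_k C[r, k] * B[k, i, j].

    Hypothetical helper: a truncated SVD over the relation mode, which is
    one simple low-rank scheme and not necessarily the paper's network.
    """
    R, d_out, d_in = T.shape
    M = T.reshape(R, d_out * d_in)            # unfold along the relation axis
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    C = U[:, :rank] * s[:rank]                # (R, rank) per-relation coefficients
    B = Vt[:rank].reshape(rank, d_out, d_in)  # (rank, d, d) shared order-3 core
    return C, B

def reconstruct(C, B):
    """Contract the two factors back into a stack of decoders."""
    return np.einsum("rk,kij->rij", C, B)

# Synthetic decoders (an assumption): every relation mixes the same 2 bases.
rng = np.random.default_rng(0)
R, d, true_rank = 6, 8, 2
bases = rng.normal(size=(true_rank, d, d))
mix = rng.normal(size=(R, true_rank))
T = np.einsum("rk,kij->rij", mix, bases)

C, B = compress_decoders(T, rank=true_rank)
T_hat = reconstruct(C, B)

rel_error = np.linalg.norm(T - T_hat) / np.linalg.norm(T)
n_original = T.size
n_compressed = C.size + B.size
print(f"relative error: {rel_error:.2e}")
print(f"parameters: {n_original} -> {n_compressed}")
```

When the decoders really do share a few underlying maps, as in this toy setup, the shared core absorbs the redundancy and each relation needs only a handful of coefficients. Whether real decoders behave this way is the empirical question the paper addresses.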
Findings
Collections of relation decoders can be highly compressed by simple order-3 tensor networks without significant loss in decoding accuracy.
Linear maps extract recurring semantic properties, not distinct relations.
Decoders generalize only to semantically similar relations.
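The cross-evaluation idea behind these findings (apply each relation's decoder to the subjects of every other relation and measure accuracy) can be sketched as follows. This is a toy illustration under stated assumptions: each decoder is taken to be an affine map fit by least squares, the data are synthetic, and `fit_decoder`, `decode`, and `cross_evaluate` are hypothetical names. The operators studied in the paper come from Hernandez et al. [2023] and may be estimated differently.

```python
import numpy as np

def fit_decoder(S, O):
    """Fit an affine map O ~ S @ W + b by least squares (an assumed estimator)."""
    X = np.hstack([S, np.ones((len(S), 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(X, O, rcond=None)
    return coef[:-1], coef[-1]                # W has shape (d, d), b has shape (d,)

def decode(S, W, b, object_embeddings):
    """Return the index of the highest-scoring candidate object per subject."""
    pred = S @ W + b
    return (pred @ object_embeddings.T).argmax(axis=1)

def cross_evaluate(decoders, datasets, object_embeddings):
    """acc[i, j] = accuracy of relation i's decoder on relation j's subjects."""
    n = len(decoders)
    acc = np.zeros((n, n))
    for i, (W, b) in enumerate(decoders):
        for j, (S, labels) in enumerate(datasets):
            acc[i, j] = (decode(S, W, b, object_embeddings) == labels).mean()
    return acc

# Synthetic setup (an assumption): relations 0 and 1 share one underlying
# property map, while relation 2 uses a different one.
rng = np.random.default_rng(0)
d, n_samples, n_objects = 8, 60, 5
E = rng.normal(size=(n_objects, d))           # candidate object embeddings
W_shared = rng.normal(size=(d, d))
W_other = rng.normal(size=(d, d))

decoders, datasets = [], []
for W_true in (W_shared, W_shared, W_other):
    S = rng.normal(size=(n_samples, d))       # subject representations
    O = S @ W_true                            # target object representations
    labels = (O @ E.T).argmax(axis=1)
    decoders.append(fit_decoder(S, O))
    datasets.append((S, labels))

acc = cross_evaluate(decoders, datasets, E)
print(np.round(acc, 2))
```

In this toy setup the decoders for relations 0 and 1 transfer to each other because they recover the same underlying map, while transfer to relation 2 degrades. Off-diagonal blocks of high accuracy are the kind of signature that a property-centric structure would produce.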
Abstract
This paper investigates the structure of linear operators introduced in Hernandez et al. [2023] that decode specific relational facts in transformer language models. We extend their single-relation findings to a collection of relations and systematically chart their organization. We show that such collections of relation decoders can be highly compressed by simple order-3 tensor networks without significant loss in decoding accuracy. To explain this surprising redundancy, we develop a cross-evaluation protocol, in which we apply each linear decoder operator to the subjects of every other relation. Our results reveal that these linear maps do not encode distinct relations, but extract recurring, coarse-grained semantic properties (e.g., country of capital city and country of food are both in the country-of-X property). This property-centric structure clarifies both the operators'…