Linear Relational Decoding of Morphology in Language Models

Eric Xia; Jugal Kalita

arXiv:2507.14640·cs.CL·July 22, 2025

TL;DR

This paper demonstrates that linear transformations in transformer models can accurately decode morphological relationships across languages, revealing interpretable and sparsely encoded conceptual structures in the model's latent space.

Contribution

The paper introduces a linear decoding method for morphology in language models and reports high faithfulness and interpretability across multiple languages and models.

Findings

01. Reaches 90% faithfulness on morphological relations

02. Shows similar results across languages and across models

03. Indicates that morphology is sparsely and interpretably encoded

Abstract

A two-part affine approximation has been found to model transformer computations well over certain subject-object relations. Adapting the Bigger Analogy Test Set, we show that the linear transformation Ws, where s is a middle-layer representation of a subject token and W is derived from model derivatives, also accurately reproduces final object states for many relations. This linear technique achieves 90% faithfulness on morphological relations, and we show similar findings across languages and across models. Our findings indicate that some conceptual relationships in language models, such as morphology, are readily interpretable from latent space and are sparsely encoded by cross-layer linear transformations.
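To make the Ws idea concrete, here is a minimal sketch on a toy stand-in for the model, not the authors' implementation. It estimates W as the Jacobian of a function mapping a subject state s to an object state, then checks how often decoding Ws picks the same token as decoding the true output. Several choices are assumptions made for illustration: the hypothetical `toy_model`, the finite-difference Jacobian (a real model would use automatic differentiation), averaging the Jacobian over a few example subjects, the decoder matrix `U`, and reading "faithfulness" as top-1 agreement.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def argmax(v):
    """Index of the largest entry of v."""
    return max(range(len(v)), key=lambda i: v[i])


def jacobian(f, s, eps=1e-5):
    """Central finite-difference Jacobian: J[i][j] = d f(s)[i] / d s[j]."""
    n_out = len(f(s))
    J = [[0.0] * len(s) for _ in range(n_out)]
    for j in range(len(s)):
        up, down = list(s), list(s)
        up[j] += eps
        down[j] -= eps
        f_up, f_down = f(up), f(down)
        for i in range(n_out):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * eps)
    return J


def mean_jacobian(f, subjects):
    """Average the Jacobian of f over several example subject states (assumed choice)."""
    jacobians = [jacobian(f, s) for s in subjects]
    n = len(jacobians)
    rows, cols = len(jacobians[0]), len(jacobians[0][0])
    return [[sum(J[i][j] for J in jacobians) / n for j in range(cols)]
            for i in range(rows)]


def faithfulness(f, W, U, subjects):
    """Fraction of subjects where decoding W s and f(s) with U picks the same token."""
    hits = sum(argmax(matvec(U, matvec(W, s))) == argmax(matvec(U, f(s)))
               for s in subjects)
    return hits / len(subjects)


if __name__ == "__main__":
    rng = random.Random(0)
    dim, vocab = 4, 6
    A = [[rng.gauss(0, 0.5) for _ in range(dim)] for _ in range(dim)]
    U = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(vocab)]

    def toy_model(s):
        # Hypothetical nonlinear map from a subject state to an object state.
        return [math.tanh(z) for z in matvec(A, s)]

    train = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(8)]
    held_out = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(50)]
    W = mean_jacobian(toy_model, train)
    print("faithfulness on held-out subjects:",
          faithfulness(toy_model, W, U, held_out))
```

The sketch drops the bias term of the two-part affine form and keeps only the linear map, which mirrors the contrast the abstract draws. On a purely linear toy function the estimated W recovers the underlying matrix and agreement is perfect; how close a real transformer comes to that is the empirical question the paper addresses.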
