Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer Language Model

Zhiyu An; Wan Du

arXiv:2601.18858 · cs.LG · March 25, 2026

TL;DR

This paper introduces Homomorphism Error (HE), a structural metric that predicts and enhances compositional generalization in Transformer language models by aligning their learned representations with linguistic rules.

Contribution

The paper proposes HE as a novel metric for understanding and improving compositional generalization, demonstrating its predictive power and effectiveness as a regularizer in training.

Findings

01

HE predicts out-of-distribution compositional generalization with high correlation (R^2=0.73).

02

HE-regularized training significantly reduces HE and improves OOD accuracy.

03

Model depth has minimal impact on HE and OOD performance.
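
The first finding reports an R^2 value. As a minimal sketch, assuming R^2 here means the coefficient of determination of a simple linear fit from HE to OOD accuracy (the summary does not say how it was computed), the calculation can be written as below. The function name `r_squared` and the sample values are hypothetical placeholders, not results from the paper.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up placeholder values (NOT from the paper): one HE score and one
# OOD accuracy per trained model.
he_scores = [0.1, 0.2, 0.3, 0.4]
ood_accuracy = [0.9, 0.8, 0.7, 0.6]
r2_placeholder = r_squared(he_scores, ood_accuracy)
```

A value near 1 means a line through the HE scores explains almost all of the variance in OOD accuracy; a value near 0 means it explains almost none.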

Abstract

Compositional generalization, the ability to interpret novel combinations of familiar components, remains a persistent challenge for neural networks. Behavioral evaluations reveal when models fail but offer limited insight into why failures arise at the representational level. We introduce Homomorphism Error (HE), a structural metric that measures the inconsistency between a set of established rules for which words combine to form new meaning (linguistic syntax) and a model's learned rules for which hidden states combine to form new states (semantic syntax). We formulate this inconsistency as deviations from approximate homomorphisms between the linguistic expression algebra and a model's hidden-state space. We designed experiments to test whether (i) HE predicts compositional generalization performance and (ii) regularizing for low HE during training improves such…
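
To make "deviation from an approximate homomorphism" concrete, here is a minimal sketch. The abstract does not give the formal definition of HE, so this is an assumed form: for each expression built from parts a and b, compare the hidden state of the whole with a composition operator applied to the hidden states of the parts, and average the relative squared residual. The names `homomorphism_error`, `compose`, and `add`, the choice of vector addition as the operator, and the toy vectors are all hypothetical.

```python
from math import fsum


def sq_norm(v):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return fsum(x * x for x in v)


def homomorphism_error(triples, compose):
    """Mean relative squared residual between h(a . b) and compose(h(a), h(b)).

    triples: iterable of (h_a, h_b, h_ab) hidden-state vectors.
    compose: assumed operator in hidden-state space (hypothetical choice).
    """
    errors = []
    for h_a, h_b, h_ab in triples:
        predicted = compose(h_a, h_b)
        residual = [p - t for p, t in zip(predicted, h_ab)]
        # Normalize by the target norm so the score is scale-free.
        errors.append(sq_norm(residual) / max(sq_norm(h_ab), 1e-12))
    return fsum(errors) / len(errors)


def add(u, v):
    """Illustrative composition operator: elementwise vector addition."""
    return [x + y for x, y in zip(u, v)]


# Toy hidden states (made up): the first set composes exactly under `add`,
# the second deviates from it.
exact = [([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]),
         ([2.0, 1.0], [1.0, 1.0], [3.0, 2.0])]
noisy = [([1.0, 0.0], [0.0, 1.0], [1.0, 2.0])]

he_exact = homomorphism_error(exact, add)
he_noisy = homomorphism_error(noisy, add)
```

Under these assumptions a score of zero means the hidden states compose exactly as the operator predicts, and larger scores mean a larger structural mismatch. In practice the operator would more plausibly be fitted to the model's hidden states than fixed in advance.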

Taxonomy

Topics: Natural Language Processing Techniques · Ferroelectric and Negative Capacitance Devices · Topic Modeling