Memorization, Emergence, and Explaining Reversal Failures: A Controlled Study of Relational Semantics in LLMs

Yihua Zhu; Qianying Liu; Jiaxin Wang; Fei Cheng; Chaoran Liu; Akiko Aizawa; Sadao Kurohashi; Hidetoshi Shimodaira

arXiv:2601.02931 · cs.CL · April 23, 2026

TL;DR

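This study asks whether autoregressive LLMs learn the logical semantics of relations (symmetry and inversion), and whether reversal failures stem from missing semantics or from left-to-right order bias. It uses a synthetic knowledge graph framework to measure memorization, logical inference, and in-context generalization to unseen entities in controlled settings.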

Contribution

The paper introduces a synthetic, knowledge graph-based framework that generates text from symmetric and inverse triples, so that relational semantics and order bias can be evaluated systematically in autoregressive language models trained from scratch.

Findings

01

Relational semantics emerge in a sharp phase transition once logic-bearing supervision is sufficient, even in shallow (2-3 layer) models.

02

Successful generalization correlates with stable signals in intermediate layers.

03

Reversal failures are mainly due to order bias, not lack of inversion semantics.

Abstract

Autoregressive LLMs perform well on relational tasks that require linking entities via relational words (e.g., father/son, friend), but it is unclear whether they learn the logical semantics of such relations (e.g., symmetry and inversion logic) and, if so, whether reversal-type failures arise from missing relational semantics or left-to-right order bias. We propose a controlled Knowledge Graph-based synthetic framework that generates text from symmetric/inverse triples, train GPT-style autoregressive models from scratch, and evaluate memorization, logical inference, and in-context generalization to unseen entities to address these questions. We find a sharp phase transition in which relational semantics emerge with sufficient logic-bearing supervision, even in shallow (2-3 layer) models, and that successful generalization aligns with stable intermediate-layer signals. Finally,…
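
The data-generation idea in the abstract (text produced from symmetric/inverse triples, with some logically implied facts available for evaluation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the relation inventory, the sentence template, the `supervised_fraction` parameter, and the function names `implied_triple`, `verbalize`, and `build_split` are all hypothetical.

```python
import random

# Hypothetical relation inventory (illustrative only, not the paper's).
SYMMETRIC = {"friend"}                          # r(a, b) implies r(b, a)
INVERSE = {"father": "son", "son": "father"}    # r(a, b) implies r_inv(b, a)


def implied_triple(triple):
    """Return the triple implied by symmetry or inversion, or None."""
    head, rel, tail = triple
    if rel in SYMMETRIC:
        return (tail, rel, head)
    if rel in INVERSE:
        return (tail, INVERSE[rel], head)
    return None


def verbalize(triple):
    """Render a triple as text with a single assumed template."""
    head, rel, tail = triple
    return f"{head} is the {rel} of {tail}."


def build_split(triples, supervised_fraction, seed=0):
    """Split sentences into training text and held-out inference queries.

    Every base triple is verbalized into the training set. Its implied
    triple goes to training with probability `supervised_fraction`
    (an assumed stand-in for the amount of logic-bearing supervision);
    otherwise it is held out as a logical-inference test item.
    """
    rng = random.Random(seed)
    train, held_out = [], []
    for triple in triples:
        train.append(verbalize(triple))
        implied = implied_triple(triple)
        if implied is None:
            continue
        if rng.random() < supervised_fraction:
            train.append(verbalize(implied))
        else:
            held_out.append(verbalize(implied))
    return train, held_out


if __name__ == "__main__":
    base = [("e1", "father", "e2"), ("e3", "friend", "e4")]
    train, held_out = build_split(base, supervised_fraction=0.5)
    print(train)
    print(held_out)
```

In this sketch, sweeping `supervised_fraction` from 0 to 1 is one way to vary how much of the symmetric/inverse structure a model sees during training; the paper's actual procedure may differ.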
