Relational reasoning and inductive bias in transformers and large language models
Jesse Geerts, Andrew Liu, Stephanie Chan, Claudia Clopath, Kimberly Stachenfeld

TL;DR
This paper investigates how transformer models perform relational reasoning, especially transitive inference, comparing in-weights and in-context learning mechanisms and their generalization behaviors.
Contribution
It shows that the training regime and the structure of the learned representations critically influence whether transformers perform transitive inference, and it examines the mechanisms behind each behavior.
Findings
IWL models learn linear embeddings enabling transitive inference.
ICL models generalize transitively only when the training data necessitates it; otherwise they learn a match-and-copy strategy.
Pre-training ICL models on linear tasks makes their reasoning resemble IWL behavior.
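
To make the first finding concrete, here is a minimal sketch of how a linear (one scalar per item) embedding trained only on adjacent pairs can support transitive inference. It is an illustration under stated assumptions, not the paper's model: the function names `train_ranks` and `prefers`, the logistic loss, the seven-item hierarchy, and the learning rate are all hypothetical choices made for this example.

```python
import math

def train_ranks(n_items=7, epochs=2000, lr=0.1):
    """Fit one scalar per item using adjacent pairs only (item i outranks item i+1)."""
    ranks = [0.0] * n_items
    pairs = [(i, i + 1) for i in range(n_items - 1)]  # (winner, loser)
    for _ in range(epochs):
        for w, l in pairs:
            # Model: p(w beats l) = sigmoid(rank_w - rank_l)
            p = 1.0 / (1.0 + math.exp(-(ranks[w] - ranks[l])))
            g = 1.0 - p  # gradient of the log-likelihood w.r.t. rank_w
            ranks[w] += lr * g
            ranks[l] -= lr * g
    return ranks

def prefers(ranks, a, b):
    """True if item a is inferred to outrank item b."""
    return ranks[a] > ranks[b]

ranks = train_ranks()
# Items 1 and 5 were never paired in training, yet the learned scalars order them.
print(prefers(ranks, 1, 5))
```

Because every item is placed on a single line, any pair can be compared by subtracting its two scalars, including pairs never seen together during training.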
Abstract
Transformer-based models have demonstrated remarkable reasoning abilities, but the mechanisms underlying relational reasoning remain poorly understood. We investigate how transformers perform \textit{transitive inference}, a classic relational reasoning behavior from psychology which elicits inference about indirectly related items (e.g., if $A > B$ and $B > C$, then $A > C$). We compare in-weights learning (IWL) and in-context learning (ICL) behaviors and mechanisms on these tasks, and find profoundly different patterns of generalization. IWL models learn a linear embedding, which leads to transitive inference as well as other behavioral effects present in humans and animals. ICL models, in contrast, are capable of learning to generalize transitively, but only do so when it is necessitated by the training data, otherwise learning a match-and-copy strategy. Interestingly, pre-training…
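
The match-and-copy strategy named in the abstract can be pictured with a small sketch. This is a hand-written lookup, not the mechanism a trained transformer implements; the function `match_and_copy`, the label convention, and the handling of reversed pairs are assumptions made for illustration.

```python
def match_and_copy(context, query):
    """Return the label of the context example that matches the query, else None."""
    for (a, b), label in context:
        if (a, b) == query:
            return label        # exact match: copy the stored label
        if (b, a) == query:
            return 1 - label    # same pair in reverse order: flip the label (assumption)
    return None                 # nothing to copy for pairs absent from the context

# The context holds adjacent pairs only; label 1 means "first item ranks higher".
context = [(("A", "B"), 1), (("B", "C"), 1), (("C", "D"), 1)]

seen = match_and_copy(context, ("B", "C"))    # pair present in the context
unseen = match_and_copy(context, ("A", "C"))  # non-adjacent pair, never shown
print(seen, unseen)
```

A lookup of this kind answers every pair that appears in the context but has no basis for answering a non-adjacent pair such as (A, C), which is why it does not generalize transitively on its own.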
