The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans

TL;DR
Large language models trained on "A is B" fail to generalize to the reverse statement "B is A". This failure, termed the Reversal Curse, persists across models and sizes and is not alleviated by data augmentation.
Contribution
This paper identifies and characterizes the Reversal Curse, revealing a fundamental limitation in how LLMs learn and generalize relational information.
Findings
Models trained on "A is B" do not automatically generalize to "B is A".
When "A is B" appears in-context, models can deduce the reverse relationship; the failure applies to facts learned through training or finetuning.
The Reversal Curse is consistent across different models and sizes.
Abstract
We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form "A is B", it will not automatically generalize to the reverse direction "B is A". This is the Reversal Curse. For instance, if a model is trained on "Valentina Tereshkova was the first woman to travel to space", it will not automatically be able to answer the question, "Who was the first woman to travel to space?". Moreover, the likelihood of the correct answer ("Valentina Tereshkova") will not be higher than for a random name. Thus, models do not generalize a prevalent pattern in their training set: if "A is B" occurs, "B is A" is more likely to occur. It is worth noting, however, that if "A is B" appears in-context, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on…
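The likelihood comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: `make_prompts`, `win_rate`, and `toy_logprob` are hypothetical names, the prompt templates and distractors are illustrative, and `toy_logprob` is a stand-in scorer that has simply memorized the forward sentence. In practice one would replace it with a function returning log P(completion | prompt) from a finetuned model.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical scorer signature: log P(completion | prompt) under some model.
LogProbFn = Callable[[str, str], float]


def make_prompts(name: str, description: str) -> Tuple[Tuple[str, str], Tuple[str, str]]:
    """Return (prompt, completion) pairs for the forward and reverse directions."""
    forward = (f"{name} was", f" {description}")
    reverse = (f"{description[0].upper()}{description[1:]} was", f" {name}")
    return forward, reverse


def win_rate(logprob: LogProbFn, prompt: str, correct: str,
             distractors: Sequence[str]) -> float:
    """Fraction of distractors scored strictly below the correct completion."""
    correct_lp = logprob(prompt, correct)
    wins = sum(correct_lp > logprob(prompt, d) for d in distractors)
    return wins / len(distractors)


def toy_logprob(prompt: str, completion: str) -> float:
    """Stand-in scorer (assumption): it has memorized only the forward sentence."""
    seen = {("Valentina Tereshkova was", " the first woman to travel to space")}
    return 0.0 if (prompt, completion) in seen else -5.0


forward, reverse = make_prompts("Valentina Tereshkova",
                                "the first woman to travel to space")

# Forward direction: the trained-on completion vs. an illustrative distractor.
forward_rate = win_rate(toy_logprob, forward[0], forward[1],
                        [" the first person to walk on the Moon"])

# Reverse direction: the correct name vs. illustrative random names.
reverse_rate = win_rate(toy_logprob, reverse[0], reverse[1],
                        [" Ada Lovelace", " Marie Curie"])
```

With this stand-in scorer the correct completion wins in the forward direction and is indistinguishable from random names in the reverse direction, which is the pattern the abstract describes. The toy scorer hard-codes that outcome, so the sketch shows only the shape of the comparison, not evidence for the effect.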
Taxonomy
Topics: Law, AI, and Intellectual Property
Methods: Attention Is All You Need · Attention Dropout · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Residual Connection · Adam
