Untying the Reversal Curse via Bidirectional Language Model Editing
Jun-Yu Ma, Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu

TL;DR
This paper introduces a new bidirectional evaluation framework and a model editing method to address the reversal curse in large language models, improving their ability to recall knowledge in both directions after editing.
Contribution
The paper proposes the BAKE benchmark and the BIRD method to evaluate and improve bidirectional knowledge recall in edited language models.
Findings
Current editing methods recall edited facts well in the forward direction (the direction of the edit) but fail in the reverse direction.
BIRD improves bidirectional knowledge recall across multiple LLMs.
Reversibility is a critical aspect of effective model editing.
Abstract
Recent studies have demonstrated that large language models (LLMs) store massive amounts of factual knowledge within their parameters. However, existing LLMs are prone to hallucinating unintended text due to false or outdated knowledge. Since retraining LLMs is resource-intensive, there has been growing interest in model editing. Despite the emergence of benchmarks and approaches, these unidirectional editing and evaluation methods have failed to explore the reversal curse. Intuitively, if "The capital of France is" is edited to yield the counterfactual answer "London" within a model, then the model should be able to naturally reason about and recall the reverse fact, i.e., "London is the capital of" followed by "France" instead of "England". In this paper, we study bidirectional language model editing, aiming to provide rigorous model editing evaluation to assess if edited LLMs can recall the editing knowledge…
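The France/London example in the abstract can be turned into a small two-direction check. The sketch below is illustrative only and is not the BAKE benchmark or its metrics: the "model" is a hypothetical lookup table that maps a prompt to candidate-answer probabilities, the probability values are invented toy numbers, and the names `prefers`, `bidirectional_check`, and `table_model` are assumptions introduced here. It only shows the idea that an edit should be judged both in the edited direction and in the reverse direction.

```python
from typing import Callable, Dict

# Hypothetical stand-in for an edited LLM: prompt -> {candidate answer: probability}.
ToyModel = Callable[[str], Dict[str, float]]


def prefers(model: ToyModel, prompt: str, new_answer: str, old_answer: str) -> bool:
    """Return True if the model ranks the edited answer above the original one."""
    probs = model(prompt)
    return probs.get(new_answer, 0.0) > probs.get(old_answer, 0.0)


def bidirectional_check(model: ToyModel) -> Dict[str, bool]:
    """Check the abstract's example edit in both directions."""
    return {
        # Forward: the edited prompt should now yield "London", not "Paris".
        "forward": prefers(model, "The capital of France is", "London", "Paris"),
        # Reverse: the inverted prompt should yield "France", not "England".
        "reverse": prefers(model, "London is the capital of", "France", "England"),
    }


def table_model(table: Dict[str, Dict[str, float]]) -> ToyModel:
    """Wrap a lookup table so it can be called like a model (toy assumption)."""
    return lambda prompt: table.get(prompt, {})


# Toy numbers: an edit that took hold only in the direction of editing.
FORWARD_ONLY = {
    "The capital of France is": {"London": 0.9, "Paris": 0.1},
    "London is the capital of": {"France": 0.2, "England": 0.8},
}

# Toy numbers: an edit that is also recalled in the reverse direction.
BIDIRECTIONAL = {
    "The capital of France is": {"London": 0.9, "Paris": 0.1},
    "London is the capital of": {"France": 0.7, "England": 0.3},
}

forward_only_result = bidirectional_check(table_model(FORWARD_ONLY))
bidirectional_result = bidirectional_check(table_model(BIDIRECTIONAL))
print(forward_only_result)
print(bidirectional_result)
```

With a real model, the lookup table would be replaced by a function that scores candidate completions for a prompt. The comparison of edited versus original answer in each direction would stay the same.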
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
