RECALL: Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles

Munachiso Nwadike; Zangir Iklassov; Toluwani Aremu; Tatsuya Hiraoka; Velibor Bojkovic; Benjamin Heinzerling; Hilal Alqaubeh; Martin Takáč; Kentaro Inui

arXiv:2501.13491·cs.CL·January 24, 2025

TL;DR

This paper introduces the RECALL mechanism, a self-referencing causal cycle that enhances language models' ability to recall prior context by leveraging cycle tokens, challenging the view of the reversal curse as an insurmountable limitation.

Contribution

It proposes the RECALL framework, formalizes its probabilistic basis, and demonstrates how cycle tokens can improve context recall in large language models.

Findings

01. RECALL improves recall of prior context in LLMs.

02. Cycle tokens enable models to bypass the reversal curse.

03. Experimental results validate the effectiveness of RECALL.

Abstract

We introduce the concept of the self-referencing causal cycle (abbreviated RECALL) - a mechanism that enables large language models (LLMs) to bypass the limitations of unidirectional causality, which underlies a phenomenon known as the reversal curse. When an LLM is prompted with sequential data, it often fails to recall preceding context. For example, when we ask an LLM to recall the line preceding "O say does that star-spangled banner yet wave" in the U.S. National Anthem, it often fails to correctly return "Gave proof through the night that our flag was still there" - this is due to the reversal curse. It occurs because language models such as ChatGPT and Llama generate text based on preceding tokens, requiring facts to be learned and reproduced in a consistent token order. While the reversal curse is often viewed as a limitation, we offer evidence of an alternative view: it is not…
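As a toy illustration of the recall problem the abstract describes, the sketch below models a "forward-only" memory in which each line only knows the line that follows it. It is not the paper's method or code: the `SEQUENCE` list, the `<anchor>` marker (a stand-in for an earlier token the model can start from), and both function names are assumptions made purely for illustration. It shows that a direct forward lookup cannot return the preceding line, while rolling forward from an earlier starting point can recover it.

```python
from typing import Optional

# Hypothetical memorized sequence. "<anchor>" is an illustrative stand-in for
# an earlier token the model can start from; the two lines are the ones
# quoted in the abstract.
SEQUENCE = [
    "<anchor>",
    "Gave proof through the night that our flag was still there",
    "O say does that star-spangled banner yet wave",
]

# Forward-only "model": each item maps to the item that follows it.
NEXT = {cur: nxt for cur, nxt in zip(SEQUENCE, SEQUENCE[1:])}


def next_item(item: str) -> Optional[str]:
    """Return what follows `item`, or None if nothing is memorized after it."""
    return NEXT.get(item)


def previous_via_forward_rollout(target: str, anchor: str) -> Optional[str]:
    """Recover the item preceding `target` using only forward steps from `anchor`."""
    current: Optional[str] = anchor
    seen = set()  # guards against looping forever if the sequence cycles
    while current is not None and current not in seen:
        seen.add(current)
        following = next_item(current)
        if following == target:
            return current
        current = following
    return None
```

Calling `next_item` on the "O say does that star-spangled banner yet wave" line yields nothing useful about what came before it, whereas `previous_via_forward_rollout` with the same line as `target` and `"<anchor>"` as the starting point returns the "Gave proof through the night..." line.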

Code & Models

Repositories

samunaai/remember (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: LLaMA