LEDOM: Reverse Language Model

Xunjian Yin; Sitao Cheng; Yuxi Xie; Xinyu Hu; Li Lin; Xinyi Wang; Liangming Pan; William Yang Wang; Xiaojun Wan

arXiv:2507.01335 · cs.CL · March 4, 2026

Open Access

TL;DR

This paper introduces LEDOM, a large-scale reverse autoregressive language model trained to predict past tokens from future context, revealing unique reasoning abilities and enabling improved output reranking through reverse posterior estimates.
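One common way to realize "predicting past tokens from future context" is to reverse each token sequence and then apply ordinary next-token prediction. The sketch below illustrates that idea only; the helper name `reverse_training_pairs` and the toy token IDs are hypothetical, and this page does not state how LEDOM's data pipeline is actually built.

```python
from typing import List, Tuple


def reverse_training_pairs(token_ids: List[int]) -> List[Tuple[List[int], int]]:
    """Build (future_context, target) pairs for right-to-left prediction.

    Assumption: the sequence is reversed once, after which standard
    next-token prediction over the reversed sequence amounts to
    predicting each original token from the tokens that follow it.
    """
    reversed_ids = token_ids[::-1]
    pairs = []
    for i in range(1, len(reversed_ids)):
        # Context holds the original "future" tokens, last token first;
        # the target is the token immediately preceding them in the original.
        pairs.append((reversed_ids[:i], reversed_ids[i]))
    return pairs


# Toy example with made-up token IDs for a four-token sequence.
pairs = reverse_training_pairs([10, 20, 30, 40])
for context, target in pairs:
    print(context, "->", target)
```

Each printed pair shows a context made of later tokens and the earlier token the model would be asked to predict from it.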

Contribution

The paper presents LEDOM, the first large-scale reverse language model, demonstrating its distinct reasoning capabilities and proposing a novel reranking method called Reverse Reward.

Findings

01. LEDOM develops abductive inference and question synthesis abilities.

02. Reverse Reward improves answer accuracy by up to 15%.

03. Bidirectional scoring reduces hallucinated reasoning in language models.

Abstract

Autoregressive language models are trained exclusively left-to-right. We explore the complementary factorization, training right-to-left at scale, and ask what reasoning patterns emerge when a model conditions on future context to predict the past. We train LEDOM, an open-source purely reverse autoregressive language model (2B/7B parameters, 435B tokens), and find it develops capabilities distinct from forward models, including abductive inference, question synthesis, and natural resolution of the reversal curse. We then explore one application of the reverse model: combining forward likelihood $P(y \mid x)$ with reverse posterior $P(x \mid y)$ through noisy channel duality. We propose Reverse Reward, which reranks forward outputs using reverse posterior estimates, and prove that bidirectional scoring penalizes hallucinated reasoning chains whose backward reconstruction degrades.…
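The reranking idea in the abstract can be illustrated with a small sketch. It assumes a log-linear combination, score(y) = log P(y | x) + λ · log P(x | y), which is one standard noisy-channel form; the exact scoring rule used by Reverse Reward is not given in the text above. The function `rerank_bidirectional`, the weight `lam`, and the toy log-probability tables are all hypothetical stand-ins for real forward and reverse model calls.

```python
from typing import Callable, List, Tuple

# A scorer takes (question, candidate_answer) and returns a log-probability.
LogProb = Callable[[str, str], float]


def rerank_bidirectional(
    question: str,
    candidates: List[str],
    forward_logprob: LogProb,
    reverse_logprob: LogProb,
    lam: float = 1.0,
) -> List[Tuple[str, float]]:
    """Sort candidates by forward log-likelihood plus weighted reverse posterior.

    Assumed scoring rule (not confirmed by the source):
        score(y) = log P_fwd(y | x) + lam * log P_rev(x | y)
    """
    scored = [
        (y, forward_logprob(question, y) + lam * reverse_logprob(question, y))
        for y in candidates
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy scores: the forward model slightly prefers "answer_a", but the
# question is much harder to reconstruct from it than from "answer_b".
FORWARD = {"answer_a": -1.0, "answer_b": -1.2}
REVERSE = {"answer_a": -5.0, "answer_b": -2.0}


def toy_forward(question: str, answer: str) -> float:
    return FORWARD[answer]


def toy_reverse(question: str, answer: str) -> float:
    return REVERSE[answer]


ranked = rerank_bidirectional(
    "toy question", ["answer_a", "answer_b"], toy_forward, toy_reverse
)
print(ranked[0][0])
```

With `lam=0.0` the ranking falls back to the forward model alone, so the weight controls how strongly a poor backward reconstruction is penalized.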

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare