JEPA-Reasoner: Decoupling Latent Reasoning from Token Generation

Bingyang Kelvin Liu; Ziyu Patrick Chen; David P. Woodruff

arXiv:2512.19171 · cs.CL · January 29, 2026

Open Access

TL;DR

JEPA-Reasoner separates latent-space reasoning from token generation. The split keeps token-level errors out of the reasoning chain and lets the model hold multiple hypotheses as mixed latent vectors. The reported result is a 149.5% improvement in 8-shot GSM8K accuracy for a 0.9B model.

Contribution

The paper proposes JEPA-Reasoner, an architecture that uses a Joint-Embedding Predictive Architecture (JEPA) to reason purely in latent space and a separate Talker module to reconstruct text, so the reasoning chain no longer depends on discrete token sampling.

Findings

01. 149.5% improvement in 8-shot GSM8K accuracy for a 0.9B model
02. Error containment prevents token errors from affecting reasoning
03. Enables representation of multiple hypotheses via mixed latent vectors
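
The third finding can be illustrated with a toy example. This is only a sketch of the general idea of a mixed latent vector, not the paper's implementation: the three basis vectors, the 0.7/0.3 weights, the unit-length normalization, and the helper names `normalize`, `mix`, and `cosine` are all assumptions made for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def mix(latents, weights):
    """Weighted sum of latent vectors, renormalized to unit length."""
    dim = len(latents[0])
    mixed = [sum(w * z[i] for w, z in zip(weights, latents)) for i in range(dim)]
    return normalize(mixed)

def cosine(a, b):
    """Cosine similarity; assumes both inputs are already unit length."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical unit-length latents standing in for three candidate hypotheses.
h_a = normalize([1.0, 0.0, 0.0])
h_b = normalize([0.0, 1.0, 0.0])
h_c = normalize([0.0, 0.0, 1.0])

# One vector that leans toward hypothesis a while keeping b alive.
mixed = mix([h_a, h_b], [0.7, 0.3])

sims = {name: cosine(mixed, h) for name, h in [("a", h_a), ("b", h_b), ("c", h_c)]}
print(sims)
```

The mixed vector stays closest to the more heavily weighted hypothesis, remains measurably similar to the second one, and is orthogonal to the hypothesis that was never mixed in. A sampled discrete token, by contrast, would have to commit to a single option.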

Abstract

Current autoregressive language models couple high-level reasoning and low-level token generation into a single sequential process, making the reasoning trajectory vulnerable to compounding expression errors. We propose JEPA-Reasoner, a novel architectural paradigm that decouples these tasks using a Joint-Embedding Predictive Architecture (JEPA) for pure latent-space reasoning and a separate Talker module for linguistic reconstruction. By isolating the reasoning engine from the discrete token-sampling process, our architecture enables: (1) Error Containment, where token-level failures cannot propagate into the latent reasoning chain; (2) Continuous Guidance, providing the generator with access to the entire lossless reasoning trajectory; and (3) Representation of Uncertainty, allowing the model to maintain multiple hypotheses via mixed latent vectors. Controlled experiments on synthetic…
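
The error-containment property described in the abstract can be sketched with a toy rollout. Everything here is a hypothetical stand-in rather than the paper's model: `reasoner_step` is a fixed 2-D rotation, `talker` reads only the current latent (the abstract says the real Talker sees the whole trajectory), and `EMBED`, `ANGLE`, and the corrupted token `"?"` are invented for the example. The only point is the data flow: in the decoupled loop the next latent depends on the previous latent alone, while in the coupled baseline it depends on the emitted token.

```python
import math

ANGLE = 0.3  # hypothetical per-step rotation, chosen only for illustration

def reasoner_step(z):
    """Toy latent reasoner: rotate the 2-D latent by a fixed angle."""
    x, y = z
    c, s = math.cos(ANGLE), math.sin(ANGLE)
    return (c * x - s * y, s * x + c * y)

def talker(z, corrupt=False):
    """Toy Talker: map a latent to a token, or emit a bad token on request."""
    if corrupt:
        return "?"
    return "A" if z[0] >= 0 else "B"

# Hypothetical token embeddings used only by the coupled baseline.
EMBED = {"A": (1.0, 0.0), "B": (-1.0, 0.0), "?": (0.0, -1.0)}

def decoupled_rollout(z0, steps, corrupt_at=None):
    """Latents feed the next step directly; tokens are a read-out only."""
    latents, tokens = [], []
    z = z0
    for t in range(steps):
        z = reasoner_step(z)
        latents.append(z)
        tokens.append(talker(z, corrupt=(t == corrupt_at)))
    return latents, tokens

def coupled_rollout(z0, steps, corrupt_at=None):
    """Baseline: the emitted token is re-embedded and fed back as state."""
    latents, tokens = [], []
    z = z0
    for t in range(steps):
        z = reasoner_step(z)
        latents.append(z)
        token = talker(z, corrupt=(t == corrupt_at))
        tokens.append(token)
        z = EMBED[token]  # token errors re-enter the chain here
    return latents, tokens

START = (1.0, 0.0)
clean_latents, clean_tokens = decoupled_rollout(START, 6)
noisy_latents, noisy_tokens = decoupled_rollout(START, 6, corrupt_at=2)
coupled_clean, _ = coupled_rollout(START, 6)
coupled_noisy, _ = coupled_rollout(START, 6, corrupt_at=2)

print(clean_latents == noisy_latents)  # True: latent chain is untouched
print(coupled_clean == coupled_noisy)  # False: the bad token changed later latents
```

In the decoupled rollout the corrupted token changes only the output at that step. In the coupled baseline the same corruption alters the latent at the following step, which is the failure mode the abstract calls compounding expression errors.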

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Multimodal Machine Learning Applications