Latent Reasoning with Supervised Thinking States
Ido Amos, Avi Caciularu, Mor Geva, Amir Globerson, Jonathan Herzig, Lior Shani, Idan Szpektor

TL;DR
Thinking States lets large language models reason while the input is being processed: it generates intermediate thoughts every few input tokens and feeds them back into the following tokens. It outperforms other latent reasoning methods and narrows the gap to chain-of-thought (CoT), matching CoT on 2-Hop QA with better latency.
Contribution
Introduces a method that reasons concurrently with input processing. Because the thoughts are represented as tokens, they can be learned from natural language supervision with teacher forcing, which is parallelizable.
Findings
Outperforms other latent reasoning methods on multiple tasks.
Narrows the gap to chain-of-thought on math problems.
Matches chain-of-thought performance on 2-Hop QA with better latency.
Abstract
Reasoning with a chain-of-thought (CoT) enables Large Language Models (LLMs) to solve complex tasks but incurs significant inference costs due to the generation of long rationales. We propose Thinking States, a method that performs reasoning while the input is being processed. Specifically, Thinking States generates sequences of thinking tokens every few input tokens, transforms the thoughts back into embedding space, and adds them to the following input tokens. This has two key advantages. First, it captures the recurrent nature of CoT, with the thought tokens generated while the input is being processed. Second, since the thoughts are represented as tokens, they can be learned from natural language supervision using teacher forcing, which is parallelizable. Empirically, Thinking States outperforms other latent reasoning methods on multiple reasoning tasks, narrowing the gap to CoT…
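The loop described in the abstract (generate thinking tokens every few input tokens, map them back into embedding space, add them to the following input tokens) can be illustrated with a toy sketch. This is not the authors' implementation. Every name here (`make_embedding_table`, `toy_think`, `thoughts_to_state`, `process_with_thinking_states`) is hypothetical, the "thought generator" is a deterministic stand-in for a real language model, and mean-pooling the thought embeddings into a single state vector is an assumption made only to keep the example small.

```python
import random


def make_embedding_table(vocab_size, dim, seed=0):
    """Build a small random embedding table (hypothetical stand-in for a model's)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def toy_think(chunk_vectors, vocab_size, num_thought_tokens):
    """Stand-in for the model's thought generator.

    A real system would decode thinking tokens with the LLM; here we derive
    token ids deterministically from the chunk's (state-augmented) vectors.
    """
    total = sum(sum(vec) for vec in chunk_vectors)
    base = int(abs(total) * 1000)
    return [(base + i) % vocab_size for i in range(num_thought_tokens)]


def thoughts_to_state(thought_ids, table):
    """Map thought tokens back into embedding space (assumed: mean of embeddings)."""
    dim = len(table[0])
    state = [0.0] * dim
    for token_id in thought_ids:
        for j in range(dim):
            state[j] += table[token_id][j]
    return [value / len(thought_ids) for value in state]


def process_with_thinking_states(input_ids, table, chunk_size=4, num_thought_tokens=3):
    """Process the input chunk by chunk, carrying a thinking state forward."""
    dim = len(table[0])
    state = [0.0] * dim  # no thoughts exist before the first chunk
    trace = []
    for start in range(0, len(input_ids), chunk_size):
        chunk = input_ids[start:start + chunk_size]
        # Add the previous chunk's thinking state to this chunk's input embeddings.
        vectors = [[table[t][j] + state[j] for j in range(dim)] for t in chunk]
        # Generate thinking tokens for this chunk, then embed them for the next one.
        thought_ids = toy_think(vectors, len(table), num_thought_tokens)
        state = thoughts_to_state(thought_ids, table)
        trace.append((chunk, thought_ids))
    return trace


if __name__ == "__main__":
    table = make_embedding_table(vocab_size=16, dim=4)
    for chunk, thoughts in process_with_thinking_states(list(range(10)), table):
        print("input chunk:", chunk, "-> thought tokens:", thoughts)
```

Because each chunk's thoughts depend on the state produced by the previous chunk, the sketch is recurrent in the same sense the abstract attributes to CoT. The thoughts are ordinary token ids, which is what would allow supervising them with natural language targets.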
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)
