The Illusion of Superposition? A Principled Analysis of Latent Thinking in Language Models
Michael Rizvi-Martel, Guillaume Rabusseau, Marius Mosbach

TL;DR
This paper investigates whether language models utilize superposition in latent reasoning, finding that only models trained from scratch show signs of superposition, while others tend to rely on shortcut solutions.
Contribution
It provides a comprehensive analysis of superposition in latent chain-of-thought reasoning across different training regimes, revealing conditions that promote or inhibit superposition.
Findings
Only models trained from scratch exhibit superposition.
Pretraining biases models toward token commitment, reducing superposition.
Capacity influences the solutions a model prefers.
Abstract
Latent reasoning via continuous chain-of-thoughts (Latent CoT) has emerged as a promising alternative to discrete CoT reasoning. Operating in continuous space increases expressivity and has been hypothesized to enable superposition: the ability to maintain multiple candidate solutions simultaneously within a single representation. Despite theoretical arguments, it remains unclear whether language models actually leverage superposition when reasoning using latent CoTs. We investigate this question across three regimes: a training-free regime that constructs latent thoughts as convex combinations of token embeddings, a fine-tuned regime where a base model is adapted to produce latent thoughts, and a from-scratch regime where a model is trained entirely with latent thoughts to solve a given task. Using Logit Lens and entity-level probing to analyze internal representations, we find that…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
