Quantifying the Necessity of Chain of Thought through Opaque Serial Depth
Jonah Brown-Cohen, David Lindner, Rohin Shah

TL;DR
This paper introduces opaque serial depth, a measure of how much serial computation a large language model can perform without interpretable intermediate steps such as chain of thought. It derives upper bounds on this depth for specific models and provides an automated method for computing such bounds.
Contribution
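The paper formalizes opaque serial depth, computes numeric upper bounds for Gemma 3 models, gives asymptotic results for architectures beyond standard LLMs, and open-sources an automated method for upper-bounding the depth of arbitrary neural networks.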
Findings
The paper computes numeric upper bounds on the opaque serial depth of Gemma 3 models
Mixture-of-Experts models likely have lower opaque serial depth than dense models
Opaque serial depth bounds how much serial reasoning a model can carry out internally, without externalizing it in its chain of thought
Abstract
Large language models (LLMs) tend to externalize their reasoning in their chain of thought, making the chain of thought a good target for monitoring. This is partially an inherent feature of the Transformer architecture: sufficiently long serial cognition must pass through the chain of thought (Korbak et al., 2025). We formalize this argument through the notion of opaque serial depth, given by the length of the longest computation that can be done without the use of interpretable intermediate steps like chain of thought. Given this formalization, we compute numeric upper bounds on the opaque serial depth of Gemma 3 models, as well as asymptotic results for additional architectures beyond standard LLMs. We also open-source an automated method that can calculate upper bounds on the opaque serial depth of arbitrary neural networks, and use it to demonstrate that Mixture-of-Experts models…
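To make the definition concrete, here is a minimal toy sketch. It is not the paper's formal definition and not the authors' released tool. It assumes a computation can be modelled as a directed acyclic graph in which some nodes (for example, sampled chain-of-thought tokens) are marked interpretable, and it measures the longest chain of non-interpretable steps. The function name `opaque_serial_depth`, the graph encoding, and the node names are all hypothetical, chosen for illustration only.

```python
from functools import lru_cache


def opaque_serial_depth(preds, interpretable):
    """Longest chain of opaque steps in a toy computation DAG.

    preds: dict mapping each node to the list of nodes it reads from.
    interpretable: set of nodes whose values are human-readable
        (an assumption of this sketch, e.g. sampled tokens).
    """

    @lru_cache(maxsize=None)
    def depth(node):
        # An interpretable node resets the count: whatever is computed
        # after it starts from a value that a monitor can read.
        if node in interpretable:
            return 0
        inputs = preds.get(node, [])
        if not inputs:
            return 0  # raw input, no computation yet
        return 1 + max(depth(p) for p in inputs)

    nodes = set(preds) | {p for ps in preds.values() for p in ps}
    return max(depth(n) for n in nodes)


# Toy two-layer model generating two tokens (hypothetical node names).
# h{t}_{l} is the layer-l activation at position t; position 2 also
# reads the layer-1 activation of position 1, as attention would.
toy_graph = {
    "h1_1": ["x"],
    "h1_2": ["h1_1"],
    "tok1": ["h1_2"],
    "h2_1": ["tok1"],
    "h2_2": ["h2_1", "h1_1"],
    "tok2": ["h2_2"],
}

depth_with_cot = opaque_serial_depth(toy_graph, {"tok1", "tok2"})
depth_all_opaque = opaque_serial_depth(toy_graph, set())
```

In this toy graph, marking the tokens as interpretable caps the opaque depth at the number of layers, while treating them as opaque lets the depth grow with the number of generated tokens. This mirrors the abstract's argument that sufficiently long serial cognition must pass through the chain of thought.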
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Topic Modeling
