Chain-of-Thought Tokens are Computer Program Variables
Fangwei Zhu, Peiyi Wang, Zhifang Sui

TL;DR
This paper investigates how chain-of-thought tokens in large language models act like variables in computer programs, revealing their role in reasoning tasks and potential limitations.
Contribution
It provides empirical evidence that CoT tokens function as variables, offering new insights into their role and limitations in LLM reasoning processes.
Findings
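CoT is essential for solving the two compositional tasks studied: multi-digit multiplication and dynamic programming.
Preserving only the tokens that store intermediate results achieves comparable performance.
Storing intermediate results in an alternative latent form does not affect model performance.
Randomly intervening on values in the CoT changes the subsequent CoT tokens and the final answer correspondingly.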
Abstract
Chain-of-thought (CoT) requires large language models (LLMs) to generate intermediate steps before reaching the final answer, and has proven effective in helping LLMs solve complex reasoning tasks. However, the inner mechanism of CoT remains largely unclear. In this paper, we empirically study the role of CoT tokens in LLMs on two compositional tasks: multi-digit multiplication and dynamic programming. While CoT is essential for solving these problems, we find that preserving only the tokens that store intermediate results achieves comparable performance. Furthermore, we observe that storing intermediate results in an alternative latent form does not affect model performance. We also randomly intervene on some values in the CoT and observe that subsequent CoT tokens and the final answer change correspondingly. These findings suggest that CoT tokens may function like variables…
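The variable analogy in the abstract can be illustrated with a toy program. The sketch below is not the paper's experimental setup: it is ordinary Python, and the names multiply_with_trace, intervene_at, and new_value are hypothetical, introduced only for illustration. It performs multi-digit multiplication while recording each partial product and running sum in a trace (standing in for CoT tokens), and optionally overwrites one partial product to show how the change propagates to later values and to the final answer.

```python
def multiply_with_trace(a, b, intervene_at=None, new_value=None):
    """Multiply a by b one digit of b at a time, recording intermediate results.

    The trace is a stand-in for chain-of-thought tokens: each entry stores
    a partial product or a running sum, like a program variable.
    intervene_at / new_value are hypothetical knobs that overwrite the
    partial product at one digit position (0 = least significant digit).
    """
    trace = []
    total = 0
    for position, digit_char in enumerate(reversed(str(b))):
        partial = a * int(digit_char) * 10 ** position
        if intervene_at == position:
            partial = new_value  # overwrite the stored intermediate value
        trace.append(("partial", position, partial))
        total += partial  # later steps read the (possibly altered) value
        trace.append(("sum", position, total))
    return total, trace


clean_answer, clean_trace = multiply_with_trace(123, 45)
altered_answer, altered_trace = multiply_with_trace(
    123, 45, intervene_at=0, new_value=600
)

print(clean_answer)    # 5535
print(altered_answer)  # 5520
```

In this toy, overwriting the first partial product (615 becomes 600) shifts every later running sum and the final answer by the same amount, which is the behavior one would expect if the recorded values act as variables rather than as decoration.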
Taxonomy
Topics: Topic Modeling · Big Data and Digital Economy · Machine Learning in Materials Science
