Understanding Hidden Computations in Chain-of-Thought Reasoning

Aryasomayajula Ram Bharadwaj

arXiv:2412.04537·cs.CL·December 9, 2024


TL;DR

This paper examines how transformer models trained with filler Chain-of-Thought sequences represent the hidden reasoning steps internally. Using layer-wise analysis, it shows that the hidden characters can be decoded without loss of performance, which makes the model's reasoning easier to interpret.

Contribution

It introduces methods to decode hidden reasoning tokens in transformer models, providing new insights into their internal representations during Chain-of-Thought reasoning.

Findings

01

Hidden characters can be recovered without performance loss

02

Layer-wise representations reveal internal reasoning processes

03

Decoding improves interpretability of model reasoning

Abstract

Chain-of-Thought (CoT) prompting has significantly enhanced the reasoning abilities of large language models. However, recent studies have shown that models can still perform complex reasoning tasks even when the CoT is replaced with filler (hidden) characters (e.g., "..."), leaving open questions about how models internally process and represent reasoning steps. In this paper, we investigate methods to decode these hidden characters in transformer models trained with filler CoT sequences. By analyzing layer-wise representations using the logit lens method and examining token rankings, we demonstrate that the hidden characters can be recovered without loss of performance. Our findings provide insights into the internal mechanisms of transformer models and open avenues for improving interpretability and transparency in language model reasoning.
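The logit lens and token-ranking analysis mentioned in the abstract can be illustrated with a small sketch. It is not the paper's code: the vocabulary, the identity unembedding matrix, the two hand-written "layer" vectors, and the helper names (`logit_lens`, `ranked_tokens`, `decode_skipping_filler`) are all hypothetical, chosen only to show the idea. The sketch assumes the logit lens means applying the final normalization and the unembedding matrix to an intermediate hidden state, and that "examining token rankings" can be read as looking at which token ranks highest once the filler token is set aside.

```python
import math

def layer_norm(h, eps=1e-5):
    # Plain layer normalization without learned gain or bias (a simplification).
    mean = sum(h) / len(h)
    var = sum((x - mean) ** 2 for x in h) / len(h)
    return [(x - mean) / math.sqrt(var + eps) for x in h]

def logit_lens(hidden, unembed):
    # Project an intermediate hidden state onto the vocabulary:
    # normalize it, then take a dot product with each unembedding row.
    normed = layer_norm(hidden)
    return [sum(w * x for w, x in zip(row, normed)) for row in unembed]

def ranked_tokens(logits, vocab):
    # Tokens sorted from highest to lowest logit.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return [vocab[i] for i in order]

def decode_skipping_filler(logits, vocab, filler="."):
    # Return the highest-ranked token that is not the filler token.
    for tok in ranked_tokens(logits, vocab):
        if tok != filler:
            return tok
    return None

# Toy setup (all values are illustrative assumptions, not data from the paper).
vocab = [".", "1", "2", "3"]
unembed = [
    [1.0, 0.0, 0.0, 0.0],  # row for "."
    [0.0, 1.0, 0.0, 0.0],  # row for "1"
    [0.0, 0.0, 1.0, 0.0],  # row for "2"
    [0.0, 0.0, 0.0, 1.0],  # row for "3"
]
layers = [
    [0.1, 0.2, 2.0, 0.3],  # earlier layer: the hidden token "2" dominates
    [3.0, 0.2, 2.0, 0.1],  # later layer: the filler "." ranks first, "2" second
]

for depth, hidden in enumerate(layers):
    logits = logit_lens(hidden, unembed)
    print(depth, ranked_tokens(logits, vocab), decode_skipping_filler(logits, vocab))
```

In this toy case the later layer would emit the filler token under ordinary greedy decoding, while the ranking still exposes the underlying token directly beneath it. With a real model, the hidden states would come from the network's intermediate layers rather than from hand-written vectors.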


Code & Models

Repositories

rokosbasilisk/filler_tokens
PyTorch · Official


Taxonomy

Topics: Semantic Web and Ontologies · Logic, Reasoning, and Knowledge