Chain-of-Thought in Large Language Models: Decoding, Projection, and Activation

Hao Yang; Qianghua Zhao; Lei Li

arXiv:2412.03944·cs.AI·December 6, 2024

Open Access

TL;DR

This paper investigates the internal mechanisms of chain-of-thought prompting in large language models, revealing how it influences token logits, neuron activation, and model understanding to improve reasoning capabilities.

Contribution

It provides a detailed analysis of decoding, projection, and activation processes, offering new insights into how chain-of-thought prompting enhances reasoning in LLMs.

Findings

01

Models imitate exemplar formats and integrate understanding of questions.

02

Logits become more concentrated during generation with chain-of-thought.

03

Activation of more neurons in final layers indicates extensive knowledge retrieval.
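
Finding 02 can be made concrete with a small sketch. One common way to quantify how concentrated a next-token distribution is (an assumption here, since this summary does not state which measure the paper uses) is the Shannon entropy of the softmax over the logits: lower entropy means probability mass sits on fewer tokens. The logit values below are made-up toy numbers, not results from the paper.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; lower values mean a more concentrated distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical logits over a 4-token vocabulary (illustrative only).
standard_logits = [2.0, 1.8, 1.5, 1.2]   # mass spread across several tokens
cot_logits = [6.0, 1.0, 0.5, 0.2]        # mass focused on one token

h_standard = entropy(softmax(standard_logits))
h_cot = entropy(softmax(cot_logits))

print(f"entropy (standard prompt, toy): {h_standard:.3f}")
print(f"entropy (CoT prompt, toy):      {h_cot:.3f}")
```

With real models the same comparison would be run on the logits at each generation step, so that fluctuations along the way and the final concentration can both be observed.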

Abstract

Chain-of-Thought prompting has significantly enhanced the reasoning capabilities of large language models, with numerous studies exploring factors influencing its performance. However, the underlying mechanisms remain poorly understood. To further demystify the operational principles, this work examines three key aspects: decoding, projection, and activation, aiming to elucidate the changes that occur within models when employing Chain-of-Thought. Our findings reveal that LLMs effectively imitate exemplar formats while integrating them with their understanding of the question, exhibiting fluctuations in token logits during generation but ultimately producing a more concentrated logits distribution, and activating a broader set of neurons in the final layers, indicating more extensive knowledge retrieval compared to standard prompts. Our code and data will be publicly available when the…
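
The claim about "a broader set of neurons" in the final layers can be illustrated with a minimal sketch. It assumes a neuron counts as active when its activation exceeds a threshold (0.0 here, as after a ReLU-style nonlinearity); this criterion, the function name `active_fraction`, and the activation values are all hypothetical and are not taken from the paper.

```python
def active_fraction(activations, threshold=0.0):
    """Fraction of neurons whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Hypothetical final-layer activations for one token (illustrative only).
standard_acts = [0.0, 0.3, 0.0, 0.0, 1.2, 0.0]
cot_acts = [0.4, 0.3, 0.0, 0.9, 1.2, 0.1]

frac_standard = active_fraction(standard_acts)
frac_cot = active_fraction(cot_acts)

print(f"active fraction (standard prompt, toy): {frac_standard:.2f}")
print(f"active fraction (CoT prompt, toy):      {frac_cot:.2f}")
```

In practice the activations would be collected per layer from the model's feed-forward blocks, and the fraction compared layer by layer between standard and Chain-of-Thought prompts.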

Taxonomy

Topics: Topic Modeling · Opinion Dynamics and Social Influence

Methods: Sparse Evolutionary Training