Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning
Akshara Prabhakar, Thomas L. Griffiths, R. Thomas McCoy

TL;DR
This paper investigates how output probability, memorization, and noisy reasoning shape the effectiveness of Chain-of-Thought prompting in large language models, and finds that performance reflects both memorized patterns and a probabilistic form of reasoning.
Contribution
The paper provides a detailed case study of how probability, memorization, and noisy reasoning affect CoT performance across three LLMs (GPT-4, Claude 3, and Llama 3.1), using the symbolic task of decoding shift ciphers.
Findings
Accuracy varies with the probability of the expected output: for GPT-4 it ranges from 26% to 70%.
CoT performance is affected by memorization and probabilistic reasoning.
Factors like task complexity and learned patterns significantly influence reasoning accuracy.
Abstract
Chain-of-Thought (CoT) prompting has been shown to enhance the multi-step reasoning capabilities of Large Language Models (LLMs). However, debates persist about whether LLMs exhibit abstract generalization or rely on shallow heuristics when given CoT prompts. To understand the factors influencing CoT reasoning, we provide a detailed case study of the symbolic reasoning task of decoding shift ciphers, where letters are shifted forward some number of steps in the alphabet. We analyze the pattern of results produced by three LLMs -- GPT-4, Claude 3, and Llama 3.1 -- performing this task using CoT prompting. By focusing on a single relatively simple task, we are able to identify three factors that systematically affect CoT performance: the probability of the task's expected output (probability), what the model has implicitly learned during pre-training (memorization), and the number of…
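For readers unfamiliar with the task, the shift cipher described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names `shift_encode` and `shift_decode` are hypothetical, and it assumes the 26-letter English alphabet with case preserved and all other characters left unchanged.

```python
def _shift_char(ch: str, shift: int) -> str:
    """Shift one ASCII letter by `shift` positions, wrapping around the alphabet."""
    if "a" <= ch <= "z":
        base = ord("a")
    elif "A" <= ch <= "Z":
        base = ord("A")
    else:
        # Assumption: spaces, digits, and punctuation pass through unchanged.
        return ch
    return chr((ord(ch) - base + shift) % 26 + base)


def shift_encode(text: str, shift: int) -> str:
    """Encode text by moving every letter `shift` steps forward in the alphabet."""
    return "".join(_shift_char(ch, shift) for ch in text)


def shift_decode(text: str, shift: int) -> str:
    """Decode by moving every letter `shift` steps backward (the inverse of encoding)."""
    return shift_encode(text, -shift)


if __name__ == "__main__":
    message = "Stay here!"
    encoded = shift_encode(message, 13)
    print(encoded)                    # Fgnl urer!
    print(shift_decode(encoded, 13))  # Stay here!
```

Decoding undoes the shift one letter at a time, so the work grows with both the message length and the shift amount when a model reasons through it step by step.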
