Analyzing the Power of Chain of Thought through Memorization Capabilities
Lijia Yu, Xiao-Shan Gao, Lijun Zhang

TL;DR
This paper asks whether chain of thought (CoT) enhances the reasoning capabilities of transformers by analyzing their memorization limits. It concludes that CoT does not universally improve reasoning power, and it also examines the memorization of infinite datasets.
Contribution
The paper gives a complete characterization of the memorization capabilities of fixed-precision transformers with and without CoT, including necessary and sufficient conditions and memorization bounds, and shows that some infinite datasets cannot be memorized.
Findings
CoT does not universally enhance transformer reasoning capabilities.
Memorization bounds for finite datasets are proportional to dataset size.
Some infinite reasoning datasets cannot be memorized by transformers with or without CoT.
Abstract
It has been shown that the chain of thought (CoT) can enhance the power of large language models (LLMs) to solve certain mathematical reasoning problems. However, the capacity of CoT is still not fully explored. As an important instance, the following basic question has not yet been answered: Does CoT expand the capability of transformers across all reasoning tasks? We demonstrate that reasoning with transformers is essentially a memorization problem for reasoning datasets. Thus, examining the power of CoT across all reasoning tasks amounts to analyzing the memorization capabilities of CoT transformers. In this paper, we give a complete description of the memorization capabilities of fixed-precision transformers with or without CoT and give a negative answer to the above-mentioned question. Precisely, we first give necessary and sufficient conditions for fixed-precision transformers…
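The abstract's framing of reasoning as memorization can be made concrete with a small sketch. Everything here is an illustrative assumption rather than the paper's formal definition: a "model" is any hypothetical next-token function, a dataset is a finite list of (question, answer) pairs, the CoT answer is taken to be the last token emitted before a hypothetical stop token, and the lookup tables merely stand in for a trained transformer.

```python
from typing import Callable, List, Optional, Tuple

Tokens = Tuple[int, ...]
NextToken = Callable[[Tokens], int]  # hypothetical stand-in for a transformer


def answer_direct(model: NextToken, question: Tokens) -> int:
    """Without CoT: the answer is the single next token after the question."""
    return model(question)


def answer_with_cot(
    model: NextToken, question: Tokens, stop: int, max_steps: int
) -> Optional[int]:
    """With CoT: generate tokens autoregressively, feeding each one back in.

    The answer is assumed to be the last token emitted before `stop`.
    Returns None if no stop token appears within `max_steps` steps.
    """
    seq = list(question)
    last = None
    for _ in range(max_steps):
        nxt = model(tuple(seq))
        if nxt == stop:
            return last
        seq.append(nxt)
        last = nxt
    return None


def memorizes(
    answer_fn: Callable[[Tokens], Optional[int]],
    dataset: List[Tuple[Tokens, int]],
) -> bool:
    """A model memorizes a dataset if it answers every question in it correctly."""
    return all(answer_fn(q) == a for q, a in dataset)


# Toy dataset: two-bit questions labelled with their parity.
dataset = [((1, 1), 0), ((1, 0), 1)]

# Toy "models" implemented as lookup tables (an assumption for illustration).
direct_table = {(1, 1): 0, (1, 0): 1}
STOP = 9
cot_table = {
    (1, 1): 1, (1, 1, 1): 0, (1, 1, 1, 0): STOP,
    (1, 0): 1, (1, 0, 1): 1, (1, 0, 1, 1): STOP,
}

direct_ok = memorizes(lambda q: answer_direct(direct_table.__getitem__, q), dataset)
cot_ok = memorizes(
    lambda q: answer_with_cot(cot_table.__getitem__, q, STOP, max_steps=8), dataset
)
```

In this toy setting both `direct_ok` and `cot_ok` hold. The paper's question concerns what fixed-precision transformers, rather than unrestricted lookup tables, can achieve under each of the two answering modes.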
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Machine Learning and Algorithms
