LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu, Xuezhe Ma

TL;DR
This paper investigates why large language models struggle with simple counting tasks, challenges prevalent conjectures about their deficiencies, and emphasizes the importance of reasoning in improving model performance.
Contribution
The study designs evaluation settings to test common conjectures about LLM deficiencies and demonstrates that reasoning strategies outperform finetuning and in-context learning for simple counting tasks.
Findings
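Prevalent conjectures that attribute counting failures to inherent LLM deficiencies (tokenization, architecture, training data) do not hold up under the paper's evaluation settings.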
Engaging reasoning improves counting accuracy more effectively than finetuning.
Transfer of advanced reasoning capabilities to simple tasks is limited.
Abstract
Interestingly, LLMs still struggle with some basic tasks that humans find trivial, e.g., counting the number of r's in the word "strawberry". Several popular conjectures (e.g., tokenization, architecture, and training data) try to explain why LLMs fail at simple word-based counting problems; they share the belief that the failure stems from model pretraining and is therefore probably inevitable during deployment. In this paper, we carefully design multiple evaluation settings to investigate the validity of these prevalent conjectures. We also measure the transferability of advanced mathematical and coding reasoning capabilities from specialized LLMs to simple counting tasks. Although specialized LLMs suffer from counting problems as well, we find conjectures about inherent deficiency of LLMs invalid and further seek opportunities to elicit knowledge and…
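To make the task concrete, the sketch below computes the reference answer for the "strawberry" example in two ways: a direct count, and a letter-by-letter tally that loosely mirrors the kind of step-by-step reasoning the summary credits with improving accuracy. The function names `count_char` and `count_by_spelling` are hypothetical, and this is not the authors' evaluation code or prompting method, only an illustration of what a correct answer looks like.

```python
def count_char(word: str, char: str) -> int:
    """Direct reference answer: occurrences of `char` in `word`, case-insensitive."""
    return word.lower().count(char.lower())


def count_by_spelling(word: str, char: str) -> tuple[int, list[str]]:
    """Spell the word out one letter at a time and keep a running tally.

    Returns the final count and a human-readable trace of each step.
    This is an assumed illustration of step-by-step counting, not the
    paper's actual reasoning prompt.
    """
    total = 0
    trace = []
    for position, letter in enumerate(word, start=1):
        is_match = letter.lower() == char.lower()
        if is_match:
            total += 1
        trace.append(f"{position}: {letter} -> {'match' if is_match else 'no match'} (running total {total})")
    return total, trace


if __name__ == "__main__":
    total, trace = count_by_spelling("strawberry", "r")
    print("\n".join(trace))
    print("direct count:", count_char("strawberry", "r"), "| tallied count:", total)
```

Both functions agree on the example from the abstract; the trace simply makes each intermediate decision visible, which is the property that distinguishes a reasoned answer from a single-shot guess.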
Taxonomy
Topics: Natural Language Processing Techniques · Cognitive and developmental aspects of mathematical skills
Methods: Softmax · Attention Is All You Need
