Mirage of Mastery: Memorization Tricks LLMs into Artificially Inflated Self-Knowledge
Sahil Kale

TL;DR
This paper shows that large language models often draw confidence from memorized solutions, which inflates their self-assessed reasoning ability, especially in science and medicine, and raises concerns about their trustworthiness.
Contribution
It introduces a novel framework to distinguish genuine reasoning from memorization in LLMs and highlights the impact of memorization on self-knowledge and generalization.
Findings
LLMs show over 45% inconsistency in feasibility assessments when faced with self-validated, logically coherent perturbations of a task.
Memorization significantly influences LLMs' confidence, especially in science and medicine domains.
Current architectures and training patterns have flaws that affect models' self-knowledge stability.
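To make the first finding concrete, here is a minimal sketch of how an inconsistency rate over paired feasibility judgments could be computed. This summary does not state the paper's exact metric, so this is only one plausible reading: the function name `inconsistency_rate`, the boolean encoding of judgments, and the sample data are all hypothetical and not taken from the paper.

```python
from typing import Sequence


def inconsistency_rate(original: Sequence[bool], perturbed: Sequence[bool]) -> float:
    """Fraction of task pairs whose feasibility judgment flips under perturbation.

    Assumption: each judgment is a boolean ("the model says it can solve this"),
    and original[i] / perturbed[i] refer to the same underlying task.
    """
    if len(original) != len(perturbed):
        raise ValueError("each original task needs exactly one perturbed counterpart")
    if not original:
        raise ValueError("need at least one task pair")
    # A pair is inconsistent when the two judgments disagree.
    flips = sum(o != p for o, p in zip(original, perturbed))
    return flips / len(original)


# Hypothetical judgments, invented for illustration only.
original_judgments = [True, True, False, True]
perturbed_judgments = [True, False, False, False]
rate = inconsistency_rate(original_judgments, perturbed_judgments)
print(f"inconsistency: {rate:.0%}")
```

Under this reading, a model with stable self-knowledge would give the same feasibility judgment for a task and its logically equivalent perturbation, so the rate would stay near zero.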
Abstract
When artificial intelligence mistakes memorization for intelligence, it creates a dangerous mirage of reasoning. Existing studies treat memorization and self-knowledge deficits in LLMs as separate issues and do not recognize an intertwining link that degrades the trustworthiness of LLM responses. In our study, we utilize a novel framework to ascertain if LLMs genuinely learn reasoning patterns from training data or merely memorize them to assume competence across problems of similar complexity focused on STEM domains. Our analysis shows a noteworthy problem in generalization: LLMs draw confidence from memorized solutions to infer a higher self-knowledge about their reasoning ability, which manifests as an over 45% inconsistency in feasibility assessments when faced with self-validated, logically coherent task perturbations. This effect is most pronounced in science and medicine domains,…
