Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension
Yanbo Fang, Ruixiang Tang

TL;DR
This paper presents K-(CSA)^2, a framework for categorizing LLM knowledge based on correctness and confidence, revealing how techniques like chain-of-thought prompting influence knowledge structures and layer-wise encoding.
Contribution
Introduces a novel framework for nuanced evaluation of LLM knowledge and analyzes how prompting and feedback modify internal knowledge representations.
Findings
Chain-of-thought prompting improves model performance, particularly for base models, and shows synergistic benefits when applied to aligned LLMs.
Higher layers encode more high-confidence knowledge.
Layer-wise analysis reveals how knowledge confidence is distributed across model layers.
Abstract
Understanding how large language models (LLMs) acquire, retain, and apply knowledge remains an open challenge. This paper introduces a novel framework, K-(CSA)^2, which categorizes LLM knowledge along two dimensions: correctness and confidence. The framework defines six categories of knowledge, ranging from highly confident correctness to confidently held misconceptions, enabling a nuanced evaluation of model comprehension beyond binary accuracy. Using this framework, we demonstrate how techniques like chain-of-thought prompting and reinforcement learning with human feedback fundamentally alter the knowledge structures of internal (pre-trained) and external (context-dependent) knowledge in LLMs. CoT particularly enhances base model performance and shows synergistic benefits when applied to aligned LLMs. Moreover, our layer-wise analysis reveals that higher layers in LLMs encode more…
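To make the two dimensions concrete, here is a minimal sketch of placing one question on a correctness/confidence grid. It is not the paper's method: this summary does not list the six K-(CSA)^2 categories or say how confidence is measured. The sketch assumes confidence can be approximated by agreement among repeatedly sampled answers, and the names `KnowledgeProfile`, `profile_answers`, the `high` threshold, and the four labels are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class KnowledgeProfile:
    accuracy: float    # fraction of sampled answers that match the reference
    confidence: float  # share of samples agreeing with the most common answer
    label: str         # hypothetical label, NOT one of the paper's six categories


def profile_answers(samples: list[str], reference: str, high: float = 0.8) -> KnowledgeProfile:
    """Place one question on a correctness/confidence grid.

    Assumption: confidence is proxied by self-agreement across samples.
    The 0.8 threshold is an arbitrary illustrative choice.
    """
    if not samples:
        raise ValueError("at least one sampled answer is required")

    # Light normalization so trivially different strings compare equal.
    normalized = [s.strip().lower() for s in samples]
    ref = reference.strip().lower()

    top_answer, top_count = Counter(normalized).most_common(1)[0]
    confidence = top_count / len(normalized)
    accuracy = sum(answer == ref for answer in normalized) / len(normalized)

    correctness = "correct" if top_answer == ref else "incorrect"
    level = "confident" if confidence >= high else "uncertain"
    return KnowledgeProfile(accuracy, confidence, f"{level}-{correctness}")
```

Under these assumptions, a question answered wrongly but consistently lands in the "confident-incorrect" cell, which is the kind of confidently held misconception that binary accuracy alone cannot distinguish from an uncertain guess.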
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Balanced Selection
