Measuring and Analyzing Intelligence via Contextual Uncertainty in Large Language Models using Information-Theoretic Metrics
Jae Wan Shim

TL;DR
This paper introduces a novel, task-agnostic method to analyze large language models by examining how their predictive uncertainty decreases with increasing context, revealing stable profiles related to model scale and text complexity.
Contribution
It presents the Entropy Decay Curve and Information Gain Span as new metrics for understanding and comparing the internal information processing of large language models.
Findings
Distinctive, stable uncertainty decay profiles depend on model scale and text complexity.
The proposed metrics effectively differentiate models based on their internal dynamics.
The method provides a new way to analyze AI systems beyond task performance.
Abstract
Large Language Models (LLMs) excel on many task-specific benchmarks, yet the mechanisms that drive this success remain poorly understood. We move from asking what these systems can do to asking how they process information. Our contribution is a task-agnostic method that builds a quantitative Cognitive Profile for any model. The profile is built around the Entropy Decay Curve -- a plot of a model's normalised predictive uncertainty as context length grows. Across several state-of-the-art LLMs and diverse texts, the curves expose distinctive, stable profiles that depend on both model scale and text complexity. We also propose the Information Gain Span (IGS) as a single index that summarises the desirability of a decay pattern. Together, these tools offer a principled way to analyse and compare the internal dynamics of modern AI systems.
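The idea behind the Entropy Decay Curve can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so everything below is an assumption: uncertainty is taken to be the Shannon entropy of the next-token distribution, normalisation divides by the logarithm of the vocabulary size, and the curve is the mean of that value at each context length. The function names and the toy distributions are hypothetical and do not come from the paper. The Information Gain Span is left out because its definition is not stated here.

```python
import math


def normalised_entropy(probs):
    """Shannon entropy of `probs` divided by log(len(probs)), a value in [0, 1].

    Assumption: this is one plausible reading of "normalised predictive
    uncertainty"; the paper may normalise differently.
    """
    k = len(probs)
    if k < 2:
        return 0.0
    # Terms with zero probability contribute nothing to the entropy.
    h = sum(-p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)


def entropy_decay_curve(dists_by_context_length):
    """Turn {context_length: [distribution, ...]} into a sorted list of
    (context_length, mean normalised entropy) points."""
    curve = []
    for n in sorted(dists_by_context_length):
        dists = dists_by_context_length[n]
        mean_h = sum(normalised_entropy(d) for d in dists) / len(dists)
        curve.append((n, mean_h))
    return curve


# Toy next-token distributions over a 4-symbol vocabulary.
# These are made-up numbers for illustration, not model output.
toy = {
    0: [[0.25, 0.25, 0.25, 0.25]],  # no context: maximally uncertain
    1: [[0.5, 0.5, 0.0, 0.0]],      # some context: two candidates remain
    2: [[1.0, 0.0, 0.0, 0.0]],      # more context: fully determined
}
curve = entropy_decay_curve(toy)
for n, h in curve:
    print(n, round(h, 3))
```

In practice the distributions would come from a model's softmax output at each position of a text, averaged over many positions or documents. Plotting the resulting points against context length gives a curve of the kind the abstract describes.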
