Intelligence Degradation in Long-Context LLMs: Critical Threshold Determination via Natural Length Distribution Analysis

Weiwei Wang; Jiyong Min; Weijie Zou

arXiv:2601.15300 · cs.CL · January 23, 2026

TL;DR

This paper investigates the performance drop in large language models when processing long contexts, identifying critical thresholds and proposing a framework to understand and mitigate this degradation.

Contribution

It introduces a natural length distribution analysis method, determines critical context length thresholds for Qwen2.5-7B, and offers a unified framework to explain and address intelligence degradation.

Findings

01. Critical threshold for Qwen2.5-7B at 40-50% of maximum context length

02. F1 scores drop from 0.55-0.56 to 0.3 at the threshold

03. Degradation pattern is consistent across models and datasets

Abstract

Large Language Models (LLMs) exhibit catastrophic performance degradation when processing contexts approaching certain critical thresholds, even when information remains relevant. This intelligence degradation, defined as a drop of over 30% in task performance, severely limits long-context applications. The degradation follows a common pattern: models maintain strong performance up to a critical threshold, then collapse catastrophically. We term this shallow long-context adaptation: models adapt for short to medium contexts but fail beyond critical thresholds. This paper presents three contributions: (1) Natural Length Distribution Analysis: We use each sample's natural token length without truncation or padding, providing stronger causal evidence that degradation results from context length itself. (2) Critical Threshold Determination: Through experiments on a mixed dataset (1,000 samples…
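
As a rough illustration of the idea in the abstract, the sketch below groups samples by their natural token length (no truncation or padding) and reports the first length bin whose mean F1 falls more than 30% below a baseline. It is not the authors' implementation. The function name `find_critical_bin`, the ten equal-width bins, the choice of the shortest populated bin as the baseline, the 1,000-token context limit, and the toy scores are all assumptions made for this example.

```python
from statistics import mean


def find_critical_bin(samples, max_context, n_bins=10, drop=0.30):
    """Return (bin_start_fraction, mean_f1) for the first degraded bin, or None.

    samples: iterable of (token_length, f1) pairs; lengths are used as-is,
             with no truncation or padding.
    max_context: model context limit in tokens (hypothetical value in the demo).
    A bin counts as degraded when its mean F1 is more than `drop` (relative)
    below the baseline. Assumption: the baseline is the shortest populated bin.
    """
    bins = {}
    for length, f1 in samples:
        # Integer arithmetic avoids floating-point errors at bin edges.
        idx = min(length * n_bins // max_context, n_bins - 1)
        bins.setdefault(idx, []).append(f1)

    ordered = sorted(bins)
    if not ordered:
        return None

    baseline = mean(bins[ordered[0]])
    for idx in ordered[1:]:
        score = mean(bins[idx])
        if score < (1 - drop) * baseline:
            return idx / n_bins, score
    return None


# Toy values invented for illustration; they are not results from the paper.
toy_samples = [(100, 0.60), (250, 0.58), (350, 0.59), (450, 0.35), (600, 0.30)]
result = find_critical_bin(toy_samples, max_context=1000)
print(result)
```

With these toy values, the first degraded bin starts at 40% of the assumed context limit. In practice the baseline definition and bin width would need to follow the paper's full methodology, which this truncated abstract does not spell out.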


Taxonomy

Topics: Software System Performance and Reliability · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)