Separating Constraint Compliance from Semantic Accuracy: A Novel Benchmark for Evaluating Instruction-Following Under Compression
Rahul Baxi

TL;DR
This paper introduces the CDCT benchmark to separately evaluate constraint compliance and semantic accuracy in LLMs under prompt compression, revealing a universal U-curve pattern and the impact of RLHF on constraint violations.
Contribution
The paper presents the novel CDCT benchmark and uncovers the orthogonal relationship between constraint compliance and semantic accuracy, along with insights into RLHF's role in constraint violations.
Findings
Universal U-curve pattern in constraint compliance across compression levels
RLHF removal significantly improves constraint compliance
Reasoning models outperform efficient models in instruction-following
Abstract
Large language models (LLMs) exhibit degraded performance under prompt compression, but the mechanisms remain poorly understood. We introduce the Compression-Decay Comprehension Test (CDCT), a benchmark that independently measures constraint compliance (CC) and semantic accuracy (SA) across compression levels. We evaluate 9 frontier LLMs across 8 concepts using 5 compression levels from extreme (c=0.0, ~2 words) to none (c=1.0, ~135 words). A three-judge LLM jury achieves almost perfect inter-rater agreement on CC (Fleiss' κ=0.90). We observe a universal U-curve pattern in constraint compliance (97.2% prevalence), with violations peaking at medium compression (c=0.5, ~27 words). Counterintuitively, models perform better at extreme compression than medium lengths. The dimensions are statistically orthogonal (r=0.193, p=0.084), with constraint effects 2.9x larger than semantic…
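The inter-rater agreement statistic named in the abstract, Fleiss' κ, can be illustrated with a minimal sketch. This is the standard textbook formula, not the authors' code; the function name `fleiss_kappa`, the pass/fail labels, and the toy ratings are assumptions made up for illustration and are not data from the paper.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Compute Fleiss' kappa for categorical ratings.

    ratings: list of items, each a list of labels (one label per rater).
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Per-item agreement: fraction of rater pairs that agree on that item.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    shares = [
        sum(c[k] for c in counts) / (n_items * n_raters) for k in categories
    ]
    p_expected = sum(s * s for s in shares)

    if p_expected == 1.0:
        # Only one category was ever used; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: three judges label four responses (NOT paper data).
toy_ratings = [
    ["pass", "pass", "pass"],
    ["fail", "fail", "fail"],
    ["pass", "pass", "fail"],
    ["pass", "pass", "pass"],
]
print(fleiss_kappa(toy_ratings))
```

On this toy input the three judges disagree on one of four items and κ comes out at 0.625. A value such as the reported 0.90 would correspond to far fewer disagreements relative to chance.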
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
