RCStat: A Statistical Framework for using Relative Contextualization in Transformers
Debabrata Mahapatra, Shubham Agarwal, Apoorv Saxena, Subrata Mitra

TL;DR
RCStat introduces a novel statistical framework that leverages raw attention logits in transformers to improve token importance attribution and enable efficient key-value compression, achieving state-of-the-art results without retraining.
Contribution
The paper presents RCStat, a method that uses raw attention logits for interpretability and compression in transformers, and that outperforms existing post-Softmax approaches.
Findings
Enhanced token-, sentence-, and chunk-level explanations.
Significant cache reduction with minimal quality loss.
State-of-the-art performance in compression and attribution benchmarks.
Abstract
Prior work on input-token importance in auto-regressive transformers has relied on Softmax-normalized attention weights, which obscure the richer structure of pre-Softmax query-key logits. We introduce RCStat, a statistical framework that harnesses raw attention logits via Relative Contextualization (RC), a random variable measuring contextual alignment between token segments, and derive an efficient upper bound for RC. We demonstrate two applications: (i) Key-Value compression, where RC-based thresholds drive adaptive key-value eviction for substantial cache reduction with minimal quality loss; and (ii) Attribution, where RC yields higher-fidelity token-, sentence-, and chunk-level explanations than post-Softmax methods. Across question answering, summarization, and attribution benchmarks, RCStat achieves significant empirical gains, delivering state-of-the-art compression and…
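To make the abstract's contrast between pre-Softmax logits and post-Softmax weights concrete, here is a minimal pure-Python sketch. It is not the paper's method: the definition of RC and its upper bound are not given in this summary, so the helper names (`raw_logits`, `softmax`, `keep_indices`) and the fixed threshold are illustrative assumptions. It relies only on a known property of Softmax: shifting every logit by the same constant leaves the weights unchanged, so the absolute strength of query-key alignment is lost after normalization.

```python
import math

def raw_logits(query, keys):
    """Scaled dot-product logits q.k / sqrt(d) for one query against each key."""
    d = len(query)
    return [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def keep_indices(logits, threshold):
    """Hypothetical stand-in for eviction (NOT the paper's RC rule):
    keep the keys whose raw logit is at or above a fixed threshold."""
    return [i for i, x in enumerate(logits) if x >= threshold]

# Two logit vectors that differ only by a constant shift of 5.0.
strong = [6.0, 5.0, 3.0]   # query aligns well with every key
weak = [1.0, 0.0, -2.0]    # query aligns poorly with every key

# After Softmax the two cases are indistinguishable ...
same_weights = softmax(strong) == softmax(weak)

# ... while a threshold on the raw logits still separates them.
kept_strong = keep_indices(strong, 0.5)
kept_weak = keep_indices(weak, 0.5)
```

In this toy case the Softmax weights are identical for both vectors, while thresholding the raw logits keeps all three keys for `strong` and only the first key for `weak`. RCStat's actual thresholds are derived from the RC statistic rather than a hand-picked constant.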
Taxonomy
Topics: Advanced Clustering Algorithms Research · Anomaly Detection Techniques and Applications
