Matrix-Decoupled Concentration for Autoregressive Sequences: Dimension-Free Guarantees for Sparse Long-Context Rewards

Pei-Sen Li

arXiv:2605.06017·cs.LG·May 19, 2026

TL;DR

This paper introduces a matrix-decoupled concentration inequality for autoregressive sequences that yields dimension-free guarantees for sparse long-context rewards in large language models.

Contribution

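It develops a sharp McDiarmid-type inequality for dependent sequences that preserves both sparsity and causality. This avoids the scalar collapse of classical inequalities, which separate the dependency structure from the target sensitivities, and it supplies the strictly causal filtration that tighter spatial methods lack.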

Findings

01

Achieves dimension-free $O(1)$ variance proxy for sparse rewards.

02

Recovers optimal constants for Markov chains.

03

Provides order-optimal bounds for causal trees.

Abstract

Sequence-level evaluations in autoregressive Large Language Models (LLMs) rely on highly dependent token generation. Establishing tight concentration bounds for these processes remains a challenge due to two fundamental bottlenecks in existing frameworks: (i) classical inequalities typically separate dependency structures from target sensitivities, leading to a scalar collapse that inflates the variance proxy to a suboptimal $O(N)$ for sparse terminal rewards; (ii) conversely, while certain spatial methods achieve tighter bounds, they lack the strictly causal filtration required by sequential generation, rendering them inapplicable to the autoregressive setting. To resolve both bottlenecks, we establish a sharp McDiarmid-type inequality for dependent sequences, governed strictly by the exact matrix-vector multiplication of the causal dependency resolvent and the target…
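The contrast between a scalar-collapsed variance proxy and a matrix-vector one can be illustrated numerically. The sketch below is a toy example under stated assumptions, not the paper's construction: the upper-triangular matrix `gamma` with geometric decay `theta`, the two proxy formulas, and all function names are hypothetical stand-ins chosen for illustration. The sensitivity vector `c` models a terminal reward that depends only on the last token.

```python
# Illustrative only: the matrix gamma, the decay rate theta, and both proxy
# formulas are assumptions made for this sketch, not the paper's definitions.


def dependency_matrix(n, theta):
    """Upper-triangular matrix with entries theta**(j - i) for j >= i.

    Entry (i, j) is a stand-in for how strongly token i influences token j.
    """
    return [[theta ** (j - i) if j >= i else 0.0 for j in range(n)]
            for i in range(n)]


def terminal_reward_sensitivity(n):
    """Sensitivity vector of a reward that depends only on the last token."""
    return [0.0] * (n - 1) + [1.0]


def scalar_proxy(gamma, c):
    """Decoupled proxy: n * (max row sum of gamma)^2 * (max |c_j|)^2.

    The dependency structure and the sensitivities are each reduced to a
    single scalar before being combined, so the sparsity of c is lost.
    """
    n = len(c)
    max_row_sum = max(sum(abs(x) for x in row) for row in gamma)
    return n * max_row_sum ** 2 * max(abs(x) for x in c) ** 2


def matrix_vector_proxy(gamma, c):
    """Coupled proxy: squared Euclidean norm of the product gamma @ c."""
    return sum(sum(g * x for g, x in zip(row, c)) ** 2 for row in gamma)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        gamma = dependency_matrix(n, theta=0.5)
        c = terminal_reward_sensitivity(n)
        print(n, scalar_proxy(gamma, c), matrix_vector_proxy(gamma, c))
```

In this toy setting the scalar proxy is at least $N$, because the largest row sum of `gamma` is at least 1, while the matrix-vector proxy equals the geometric sum $\sum_{k=0}^{N-1} \theta^{2k} \le 1/(1-\theta^2)$ and so stays bounded as $N$ grows.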
