DySCO: Dynamic Attention-Scaling Decoding for Long-Context Language Models

Xi Ye; Wuwei Zhang; Fangcong Yin; Howard Yen; Danqi Chen

arXiv:2602.22175 · cs.CL · April 17, 2026

TL;DR

DySCO is a training-free decoding algorithm that improves long-context reasoning in language models. At each decoding step it uses retrieval heads to identify task-relevant tokens and up-weights attention to them.

Contribution

The paper introduces DySCO, a training-free decoding method that uses retrieval heads to improve long-context reasoning in existing language models.

Findings

01

DySCO improves performance on long-context reasoning benchmarks by up to 25%.

02

The method is applicable to any off-the-shelf language model.

03

Dynamic attention rescaling and retrieval-head guided selection are key to its effectiveness.

Abstract

Understanding and reasoning over long contexts is a crucial capability for language models (LMs). Although recent models support increasingly long context windows, their accuracy often deteriorates as input length grows. In practice, models often struggle to keep attention aligned with the most relevant context throughout decoding. In this work, we propose DySCO, a novel decoding algorithm for improving long-context reasoning. DySCO leverages retrieval heads (a subset of attention heads specialized for long-context retrieval) to identify task-relevant tokens at each decoding step and explicitly up-weight them. By doing so, DySCO dynamically adjusts attention during generation to better utilize relevant context. The method is training-free and can be applied directly to any off-the-shelf LM. Across multiple instruction-tuned and reasoning models, DySCO consistently improves performance…
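
The abstract describes the mechanism only at a high level, so the following is a minimal, illustrative sketch of one decoding step, not the authors' implementation. It assumes that relevance is scored by averaging the attention distributions of the retrieval heads, that the top-k tokens are selected, and that up-weighting multiplies their unnormalized attention weight by a constant factor before renormalizing. The function names and the `top_k` and `scale` parameters are hypothetical and do not come from the paper or from the princeton-pli/DySCO repository.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one head's attention logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_relevant_tokens(
    head_logits: Sequence[Sequence[float]],
    retrieval_heads: Sequence[int],
    top_k: int,
) -> List[int]:
    """Assumed selection rule: average the retrieval heads' attention
    distributions and keep the top_k highest-scoring context positions."""
    seq_len = len(head_logits[0])
    scores = [0.0] * seq_len
    for h in retrieval_heads:
        for i, p in enumerate(softmax(head_logits[h])):
            scores[i] += p / len(retrieval_heads)
    ranked = sorted(range(seq_len), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:top_k])


def rescale_attention(
    head_logits: Sequence[Sequence[float]],
    selected: Sequence[int],
    scale: float,
) -> List[List[float]]:
    """Assumed up-weighting rule: adding log(scale) to a logit multiplies
    that token's unnormalized attention weight by `scale`; softmax then
    renormalizes each head's distribution."""
    bias = math.log(scale)
    chosen = set(selected)
    rescaled = []
    for logits in head_logits:
        boosted = [x + bias if i in chosen else x for i, x in enumerate(logits)]
        rescaled.append(softmax(boosted))
    return rescaled


# Toy step: 2 heads over 4 context tokens; head 1 plays the retrieval head.
head_logits = [
    [0.0, 0.0, 0.0, 0.0],  # ordinary head, uniform attention
    [0.0, 3.0, 0.0, 0.0],  # retrieval head, focused on token 1
]
selected = select_relevant_tokens(head_logits, retrieval_heads=[1], top_k=1)
rescaled = rescale_attention(head_logits, selected, scale=2.0)
```

In this toy step the retrieval head singles out token 1, and doubling its weight raises the ordinary head's attention on that token from 0.25 to 0.4. In an actual model the selection and rescaling would be repeated at every decoding step, which is what makes the adjustment dynamic.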

Code & Models

Repositories

princeton-pli/DySCO (GitHub)
