ROSA-Tuning: Enhancing Long-Context Modeling via Suffix Matching

Yunao Zheng; Xiaojie Wang; Lei Ren; Wei Chen

arXiv:2602.02499 · cs.CL · February 5, 2026

Open Access

TL;DR

ROSA-Tuning adds a retrieval-and-recall mechanism to pretrained language models, restoring long-context performance close to that of global attention while keeping efficiency comparable to windowed attention.

Contribution

The paper proposes ROSA-Tuning, a retrieval-and-recall method that enhances the long-context modeling ability of pretrained models without incurring high computational cost.

Findings

01 Restores long-context modeling ability close to that of global attention.

02 Maintains computational efficiency comparable to windowed attention.

03 Achieves strong performance on the LongBench benchmark.

Abstract

Long-context capability and computational efficiency are among the central challenges facing today's large language models. Existing efficient attention methods reduce computational complexity, but they typically suffer from a limited coverage of the model state. This paper proposes ROSA-Tuning, a retrieval-and-recall mechanism for enhancing the long-context modeling ability of pretrained models. Beyond the standard attention mechanism, ROSA-Tuning leverages in parallel a CPU-based ROSA (RWKV Online Suffix Automaton) retrieval module, which efficiently locates historical positions in long contexts that are relevant to the current query, and injects the retrieved information into the model state in a trainable manner; subsequent weighted fusion can then be handled by range-restricted attention. To enable end-to-end training, we employ the binary discretization strategy and the…
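The retrieval step described in the abstract, locating earlier positions whose context matches the current suffix, can be illustrated with a small sketch. This is not the paper's implementation: the abstract describes an online suffix automaton running on the CPU, whereas the code below is a naive quadratic scan over plain integer tokens, written only to show what "suffix matching" retrieves. The function name `recall_positions`, the `max_len` cap, the tie-breaking rule (prefer the most recent match), and the convention of returning the position right after the matched span are all assumptions made for illustration.

```python
from typing import List, Optional


def recall_positions(tokens: List[int], max_len: int = 32) -> List[Optional[int]]:
    """For each position i, find an earlier end position j < i whose preceding
    tokens match the longest suffix of tokens[: i + 1], and return j + 1
    (the position that followed the earlier occurrence), or None if no
    earlier token matches. Naive O(n^2 * max_len) scan, for illustration only.
    """
    out: List[Optional[int]] = []
    for i in range(len(tokens)):
        best_len = 0
        best_next: Optional[int] = None
        for j in range(i):  # candidate end of an earlier occurrence
            k = 0
            # Extend the match backwards from j and i at the same time.
            while k < max_len and j - k >= 0 and tokens[j - k] == tokens[i - k]:
                k += 1
            # Keep the longest match; on ties, keep the most recent one.
            if k > 0 and k >= best_len:
                best_len, best_next = k, j + 1
        out.append(best_next)
    return out


if __name__ == "__main__":
    history = [1, 2, 3, 9, 1, 2]
    print(recall_positions(history))
```

In this toy history, the final suffix `1, 2` also occurred at the start of the sequence, so the last position retrieves index 2, the token that followed the earlier occurrence. A suffix automaton would be expected to answer the same kind of query incrementally rather than by rescanning the history at every step.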

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications