A Survey of Retentive Network

Haiqi Yang; Zhiyuan Li; Yi Chang; Yuan Wu

arXiv:2506.06708 · cs.CL · June 10, 2025

Open Access

TL;DR

Retentive Network (RetNet) is a neural architecture that offers an efficient, scalable alternative to Transformers by combining recurrence and attention, enabling linear-time inference and broad applicability across domains.

Contribution

This paper provides the first comprehensive survey of RetNet, detailing its core innovations, applications, challenges, and future research directions.

Findings

01. RetNet achieves efficient long-sequence modeling with linear inference time.

02. RetNet demonstrates strong cross-domain performance in NLP, speech, and time-series.

03. The survey highlights key challenges and potential future developments for RetNet.

Abstract

Retentive Network (RetNet) represents a significant advancement in neural network architecture, offering an efficient alternative to the Transformer. While Transformers rely on self-attention to model dependencies, they suffer from high memory costs and limited scalability when handling long sequences due to their quadratic complexity. To mitigate these limitations, RetNet introduces a retention mechanism that unifies the inductive bias of recurrence with the global dependency modeling of attention. This mechanism enables linear-time inference, facilitates efficient modeling of extended contexts, and remains compatible with fully parallelizable training pipelines. RetNet has garnered significant research interest due to its consistently demonstrated cross-domain effectiveness, achieving robust performance across machine learning paradigms including natural language processing, speech…
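
The two computation modes the abstract describes, parallel for training and recurrent for inference, can be illustrated with a minimal NumPy sketch. It is not taken from the survey: it assumes the simplified single-head retention form from the original RetNet paper, with a scalar decay `gamma` and without the position rotation, normalization, or gating used in the full model. The function names are hypothetical.

```python
import numpy as np

def retention_parallel(q, k, v, gamma):
    """Parallel form (assumed simplification): O = (Q K^T * D) V,
    where D[n, m] = gamma**(n - m) for n >= m and 0 otherwise."""
    n = q.shape[0]
    idx = np.arange(n)
    diff = idx[:, None] - idx[None, :]
    # Causal decay mask: zero above the diagonal, gamma**(n - m) on and below it.
    decay = np.where(diff >= 0, gamma ** np.maximum(diff, 0), 0.0)
    return ((q @ k.T) * decay) @ v

def retention_recurrent(q, k, v, gamma):
    """Recurrent form (assumed simplification):
    S_t = gamma * S_{t-1} + k_t^T v_t, then o_t = q_t S_t."""
    d_k, d_v = k.shape[1], v.shape[1]
    state = np.zeros((d_k, d_v))  # fixed-size state, independent of sequence length
    out = np.empty((q.shape[0], d_v))
    for t in range(q.shape[0]):
        state = gamma * state + np.outer(k[t], v[t])
        out[t] = q[t] @ state
    return out

# Random toy inputs: sequence length 6, head dimension 4.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((6, 4)) for _ in range(3))
out_parallel = retention_parallel(q, k, v, gamma=0.9)
out_recurrent = retention_recurrent(q, k, v, gamma=0.9)
same = np.allclose(out_parallel, out_recurrent)
```

In this sketch the parallel form builds a full sequence-by-sequence decay mask, which suits batched training, while the recurrent form carries only a fixed-size state from step to step, so the per-token cost does not grow with context length. Both produce the same outputs.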

Taxonomy

Topics: Machine Learning in Healthcare · Domain Adaptation and Few-Shot Learning · Topic Modeling