A Survey of Retentive Network
Haiqi Yang, Zhiyuan Li, Yi Chang, Yuan Wu

TL;DR
Retentive Network (RetNet) is a neural architecture that offers an efficient, scalable alternative to Transformers by combining recurrence and attention, enabling linear-time inference and broad applicability across domains.
Contribution
This paper provides the first comprehensive survey of RetNet, detailing its core innovations, applications, challenges, and future research directions.
Findings
RetNet achieves efficient long-sequence modeling with linear inference time.
RetNet demonstrates strong cross-domain performance in NLP, speech, and time-series tasks.
The survey highlights key challenges and potential future developments for RetNet.
Abstract
Retentive Network (RetNet) represents a significant advancement in neural network architecture, offering an efficient alternative to the Transformer. While Transformers rely on self-attention to model dependencies, they suffer from high memory costs and limited scalability when handling long sequences due to their quadratic complexity. To mitigate these limitations, RetNet introduces a retention mechanism that unifies the inductive bias of recurrence with the global dependency modeling of attention. This mechanism enables linear-time inference, facilitates efficient modeling of extended contexts, and remains compatible with fully parallelizable training pipelines. RetNet has garnered significant research interest due to its consistently demonstrated cross-domain effectiveness, achieving robust performance across machine learning paradigms including natural language processing, speech…
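The abstract's claim that retention supports both parallel training and recurrent inference can be illustrated with a minimal single-head sketch. It assumes the simplified retention formulation commonly cited from the original RetNet paper: an attention-like form with a causal decay mask gamma^(n-m), and an equivalent recurrence over a fixed-size state. The function names, the scalar decay `gamma`, and the toy shapes are illustrative assumptions, not taken from this survey. Position rotations, normalization, gating, and multi-scale heads are omitted.

```python
import random


def retention_parallel(q, k, v, gamma):
    """Attention-like form: out[n] = sum_{m<=n} gamma**(n-m) * (q[n] . k[m]) * v[m].

    Every position is computed independently from pairwise scores, which is
    what makes this form suitable for parallel training. It is written as
    explicit loops here for clarity and costs O(n^2) in sequence length.
    """
    d_v = len(v[0])
    out = []
    for n in range(len(q)):
        row = [0.0] * d_v
        for m in range(n + 1):  # causal: only positions m <= n contribute
            score = gamma ** (n - m) * sum(a * b for a, b in zip(q[n], k[m]))
            for c in range(d_v):
                row[c] += score * v[m][c]
        out.append(row)
    return out


def retention_recurrent(q, k, v, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + k_n^T v_n, out[n] = q_n S_n.

    The state S has fixed size (d_k x d_v), so the cost of each step does not
    grow with the number of tokens already processed.
    """
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    out = []
    for q_t, k_t, v_t in zip(q, k, v):
        state = [
            [gamma * state[a][c] + k_t[a] * v_t[c] for c in range(d_v)]
            for a in range(d_k)
        ]
        out.append([sum(q_t[a] * state[a][c] for a in range(d_k)) for c in range(d_v)])
    return out


def rand_matrix(rows, cols):
    """Hypothetical helper that builds random toy inputs."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
q, k, v = rand_matrix(6, 4), rand_matrix(6, 4), rand_matrix(6, 4)
par = retention_parallel(q, k, v, 0.9)
rec = retention_recurrent(q, k, v, 0.9)
# Largest elementwise gap between the two forms; expected to be rounding noise.
max_diff = max(abs(a - b) for rp, rr in zip(par, rec) for a, b in zip(rp, rr))
print(max_diff < 1e-9)
```

Under these assumptions the two functions agree up to floating-point rounding. This shows the trade-off the abstract describes: the first form exposes all positions at once for training, while the second processes one token at a time with a constant-size state.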
Taxonomy
Topics: Machine Learning in Healthcare · Domain Adaptation and Few-Shot Learning · Topic Modeling
