Recurrent Attention Networks for Long-text Modeling
Xianming Li, Zongxi Li, Xiaotian Luo, Haoran Xie, Xing Lee, Yingbin Zhao, Fu Lee Wang, Qing Li

TL;DR
This paper introduces Recurrent Attention Network (RAN), a novel model that enables efficient, scalable long-text encoding by combining self-attention with recurrent structures, improving performance on classification and sequential tasks.
Contribution
The paper proposes RAN, a new long-document encoding model that allows recurrent self-attention, reducing computational complexity and enabling parallel processing for long texts.
Findings
RAN effectively encodes long texts for classification tasks.
RAN demonstrates competitive performance on sequential tasks.
The model supports scalable, parallel long-text processing.
Abstract
Self-attention-based models have achieved remarkable progress in short-text mining. However, the quadratic computational complexities restrict their application in long text processing. Prior works have adopted the chunking strategy to divide long documents into chunks and stack a self-attention backbone with the recurrent structure to extract semantic representation. Such an approach disables parallelization of the attention mechanism, significantly increasing the training cost and raising hardware requirements. Revisiting the self-attention mechanism and the recurrent structure, this paper proposes a novel long-document encoding model, Recurrent Attention Network (RAN), to enable the recurrent operation of self-attention. Combining the advantages from both sides, the well-designed RAN is capable of extracting global semantics in both token-level and document-level representations,…
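The chunking idea described in the abstract can be illustrated with a small sketch. This is not the authors' RAN implementation: the function names, the single carried memory vector, and the unparameterized attention (queries, keys, and values are all the raw input rows) are assumptions chosen only to show how attention within fixed-size windows, plus a state passed between windows, avoids scoring every token pair. The sketch loops over windows sequentially, so it does not reproduce the parallelization the paper claims.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(rows):
    # Unparameterized scaled dot-product self-attention (Q = K = V = rows).
    d = len(rows[0])
    out = []
    for q in rows:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in rows]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, rows)) for j in range(d)])
    return out

def windowed_recurrent_attention(tokens, window):
    # Hypothetical helper: attend within each window and carry one
    # memory vector from window to window (an illustrative assumption).
    d = len(tokens[0])
    memory = [0.0] * d
    outputs = []
    score_count = 0  # number of query-key scores computed
    for start in range(0, len(tokens), window):
        chunk = [memory] + tokens[start:start + window]
        attended = self_attention(chunk)
        memory = attended[0]          # updated state for the next window
        outputs.extend(attended[1:])  # token-level representations
        score_count += len(chunk) ** 2
    return outputs, memory, score_count

# Toy input: 8 tokens with 2 features each.
tokens = [[float(i % 3), 1.0] for i in range(8)]
outputs, memory, cost = windowed_recurrent_attention(tokens, window=4)
full_cost = len(tokens) ** 2
```

For this toy input, the windowed version computes 50 query-key scores (two windows of 4 tokens plus the memory row, 25 scores each), against 64 for full attention over all 8 tokens. The gap widens as the sequence grows, because the windowed cost scales linearly with length for a fixed window size while full attention scales quadratically.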
Taxonomy
Topics: Topic Modeling · Neural Networks and Applications · Text and Document Classification Technologies
