Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective
Zhen Qin, Xuyang Shen, Dong Li, Weigao Sun, Stan Birchfield, Richard Hartley, Yiran Zhong

TL;DR
This paper introduces the Linear Complexity Sequence Model (LCSM), unifying various sequence modeling techniques with a three-stage process to improve understanding and performance in language modeling and retrieval tasks.
Contribution
The paper proposes a unified framework for sequence models based on linear complexity, analyzing the impact of each stage and setting to enhance comprehension and performance.
Findings
Data-driven methods improve language modeling.
Hand-crafted methods enhance retrieval tasks.
A comprehensive analysis shows how the settings of each stage affect performance.
Abstract
We present the Linear Complexity Sequence Model (LCSM), a comprehensive solution that unites various sequence modeling techniques with linear complexity, including linear attention, state space model, long convolution, and linear RNN, within a single framework. The goal is to enhance comprehension of these models by analyzing the impact of each component from a cohesive and streamlined viewpoint. Specifically, we segment the modeling processes of these models into three distinct stages: Expand, Oscillation, and Shrink (EOS), with each model having its own specific settings. The Expand stage involves projecting the input signal onto a high-dimensional memory state. This is followed by recursive operations performed on the memory state in the Oscillation stage. Finally, the memory state is projected back to a low-dimensional space in the Shrink stage. We perform comprehensive experiments…
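The three stages described in the abstract can be illustrated with a small NumPy sketch. This is only one plausible instantiation, written under stated assumptions, and not the authors' implementation: the function name `eos_scan`, the outer-product form of Expand, the elementwise decay used for Oscillation, and the dot-product readout used for Shrink are all hypothetical choices made for illustration.

```python
import numpy as np


def eos_scan(inputs, expand, oscillate, shrink):
    """Hypothetical Expand / Oscillation / Shrink recurrence (illustrative only).

    inputs:    (n, d) input sequence
    expand:    (n, k) per-step expand vectors
    oscillate: (n, k) per-step decay factors applied to the memory rows
    shrink:    (n, k) per-step readout vectors
    Returns an (n, d) output sequence. Cost is O(n * k * d), linear in n.
    """
    n, d = inputs.shape
    k = expand.shape[1]
    memory = np.zeros((k, d))      # high-dimensional memory state
    outputs = np.empty((n, d))
    for t in range(n):
        # Oscillation (assumed form): scale each memory row by a decay factor.
        memory = oscillate[t][:, None] * memory
        # Expand (assumed form): lift the d-dim input into the (k, d) memory
        # via an outer product and accumulate it.
        memory = memory + np.outer(expand[t], inputs[t])
        # Shrink (assumed form): project the memory back to d dimensions.
        outputs[t] = shrink[t] @ memory
    return outputs


# With k = 1 and every factor set to 1, the recurrence is a running sum.
x = np.arange(6, dtype=float).reshape(3, 2)
ones = np.ones((3, 1))
print(eos_scan(x, ones, ones, ones))
```

In this sketch, whether `expand`, `oscillate`, and `shrink` are computed from the input or fixed in advance corresponds to the data-driven versus hand-crafted distinction mentioned in the findings; how each surveyed model actually sets them is not specified in the text above.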
Taxonomy
Topics: Data Mining and Machine Learning Applications
