You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet

Zhen Qin; Yuxin Mao; Xuyang Shen; Dong Li; Jing Zhang; Yuchao Dai; Yiran Zhong

arXiv:2405.21022·cs.CL·June 3, 2024

TL;DR

LightNet introduces an efficient multi-dimensional sequential modeling framework that uses an additive linear recurrence and new positional encoding methods, enabling single-scan processing for tasks like image and language modeling with improved speed and versatility.

Contribution

The paper proposes an additive linear recurrence to replace multiplicative recurrence, allowing multi-dimensional data to be processed in a single scan, and introduces LightNet and new positional encodings for enhanced efficiency.
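
The difference between the two recurrence families can be illustrated with a toy example. The sketch below is not the paper's formulation: the function names, the fixed `decay` value, and the uniform weights are all hypothetical, chosen only to show why a decayed (multiplicative) state depends on scan order while an additively accumulated state does not.

```python
def multiplicative_scan(xs, decay):
    """Run h_t = decay * h_{t-1} + x_t and return every intermediate state."""
    h, states = 0.0, []
    for x in xs:
        h = decay * h + x
        states.append(h)
    return states


def additive_scan(xs, weights):
    """Run h_t = h_{t-1} + w_t * x_t and return every intermediate state."""
    h, states = 0.0, []
    for x, w in zip(xs, weights):
        h = h + w * x
        states.append(h)
    return states


# A toy 2 x 3 "image", flattened in two different scan orders.
grid = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
row_major = [v for row in grid for v in row]
col_major = [grid[r][c] for c in range(3) for r in range(2)]

# Hypothetical uniform weights; a real model would compute them from the input.
ones = [1.0] * 6

mult_row = multiplicative_scan(row_major, decay=0.5)[-1]
mult_col = multiplicative_scan(col_major, decay=0.5)[-1]
add_row = additive_scan(row_major, ones)[-1]
add_col = additive_scan(col_major, ones)[-1]

print(mult_row, mult_col)  # 10.03125 9.28125: depends on scan order
print(add_row, add_col)    # 21.0 21.0: same in either order
```

In the decayed version, each input's contribution shrinks with its distance along the scan path, so the result changes with the flattening order. The additive version ends in a plain weighted sum, which is the same for any ordering of the positions. This is one way to read the single-scan claim, under the simplifying assumptions stated above.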

Findings

1. LightNet achieves efficient multi-dimensional modeling with a single scan.

2. The new positional encodings improve positional awareness in multi-dimensional data.

3. Empirical results show LightNet's effectiveness across various tasks.

Abstract

Linear attention mechanisms have gained prominence in causal language models due to their linear computational complexity and enhanced speed. However, the inherent decay mechanism in linear attention presents challenges when applied to multi-dimensional sequence modeling tasks, such as image processing and multi-modal learning. In these scenarios, the utilization of sequential scanning to establish a global receptive field necessitates multiple scans for multi-dimensional data, thereby leading to inefficiencies. This paper identifies the inefficiency caused by a multiplicative linear recurrence and proposes an efficient alternative additive linear recurrence to avoid the issue, as it can handle multi-dimensional data within a single scan. We further develop an efficient multi-dimensional sequential modeling framework called LightNet based on the new recurrence. Moreover, we present two…

Taxonomy

Topics: Neural Networks and Applications