LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions

Victor Agostinelli; Sanghyun Hong; Lizhong Chen

arXiv:2405.13046 · cs.CL · May 24, 2024

TL;DR

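LeaPformer adapts linear transformers to autoregressive and simultaneous tasks by replacing length-dependent, position-based re-weighting with learned sequence proportions. It achieves the best quality-throughput trade-off on Long-Range Arena and is also evaluated on Wikitext-103 language modeling and simultaneous speech-to-text translation.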

Contribution

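The paper proposes Learned Proportions (LeaP) and LeaPformers. The method generalizes the dependence of re-weighting functions on explicit positional representations and sequence lengths into a dependence on sequence proportions, and derives those proportions dynamically via a compact module, which allows more flexible attention concentration patterns.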

Findings

01. Achieves the best quality-throughput trade-off on the Long-Range Arena benchmark.

02. Demonstrates competitive results on Wikitext-103 language modeling.

03. Performs well on simultaneous speech-to-text translation tasks.

Abstract

A promising approach to preserving model performance in linearized transformers is to employ position-based re-weighting functions. However, state-of-the-art re-weighting functions rely heavily on target sequence lengths, making it difficult or impossible to apply them to autoregressive and simultaneous tasks, where the target and sometimes even the input sequence length are unknown. To address this issue, we propose Learned Proportions (LeaP) and LeaPformers. Our contribution is built on two major components. First, we generalize the dependence on explicit positional representations and sequence lengths into dependence on sequence proportions for re-weighting. Second, we replace static positional representations with dynamic proportions derived via a compact module, enabling more flexible attention concentration patterns. We evaluate LeaPformer against eight representative efficient…
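The two components described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not stated on this page: the re-weighting function is taken to be a cosine of the difference between query and key proportions (in the style of earlier cosine re-weighting work), the "compact module" is stood in for by a hypothetical sigmoid of a linear map (`learned_proportion`), and every function name and weight below is illustrative rather than taken from the official repository. The sketch shows that a static proportion `i / length` needs the sequence length, while a proportion computed from the token alone does not, and that the cosine form still decomposes into running sums for causal linear attention.

```python
import math

HALF_PI = math.pi / 2


def static_proportions(length):
    """Position-based proportions i / length; the full length must be known."""
    return [i / length for i in range(length)]


def learned_proportion(token, weights, bias=0.0):
    """Hypothetical compact module: sigmoid of a linear map of one token.

    The result lies in (0, 1) and needs no sequence length.
    """
    z = sum(t * w for t, w in zip(token, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def relu(vec):
    return [max(0.0, x) for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(queries, keys, values, q_props, k_props, eps=1e-9):
    """Reference causal attention, O(n^2), with assumed cosine re-weighting."""
    outputs = []
    for i, q in enumerate(queries):
        num = [0.0] * len(values[0])
        den = 0.0
        for j in range(i + 1):
            weight = dot(relu(q), relu(keys[j])) * math.cos(
                HALF_PI * (q_props[i] - k_props[j])
            )
            num = [n + weight * v for n, v in zip(num, values[j])]
            den += weight
        outputs.append([n / (den + eps) for n in num])
    return outputs


def linear_attention(queries, keys, values, q_props, k_props, eps=1e-9):
    """Same output from running sums: cos(a - b) = cos a cos b + sin a sin b."""
    d, dv = len(keys[0]), len(values[0])
    kv_cos = [[0.0] * dv for _ in range(d)]  # running sum of cos-scaled k v^T
    kv_sin = [[0.0] * dv for _ in range(d)]  # running sum of sin-scaled k v^T
    k_cos = [0.0] * d  # running sum of cos-scaled keys (normalizer)
    k_sin = [0.0] * d  # running sum of sin-scaled keys (normalizer)
    outputs = []
    for q, k, v, pq, pk in zip(queries, keys, values, q_props, k_props):
        kc = [x * math.cos(HALF_PI * pk) for x in relu(k)]
        ks = [x * math.sin(HALF_PI * pk) for x in relu(k)]
        # Update the state with the current key first, so position i attends to j <= i.
        for a in range(d):
            k_cos[a] += kc[a]
            k_sin[a] += ks[a]
            for b in range(dv):
                kv_cos[a][b] += kc[a] * v[b]
                kv_sin[a][b] += ks[a] * v[b]
        qc = [x * math.cos(HALF_PI * pq) for x in relu(q)]
        qs = [x * math.sin(HALF_PI * pq) for x in relu(q)]
        num = [
            sum(qc[a] * kv_cos[a][b] + qs[a] * kv_sin[a][b] for a in range(d))
            for b in range(dv)
        ]
        den = dot(qc, k_cos) + dot(qs, k_sin)
        outputs.append([n / (den + eps) for n in num])
    return outputs
```

Calling `linear_attention` with proportions from `learned_proportion` processes tokens one at a time without ever consulting the total length, which is the property that matters for autoregressive and simultaneous decoding. Passing `static_proportions(len(queries))` instead reproduces the length-dependent variant for comparison.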

Code & Models

Repositories

osu-starlab/leapformer (PyTorch, official)

Taxonomy

Topics: Hand Gesture Recognition Systems · Neural Networks and Applications