Linear-Time Self Attention with Codeword Histogram for Efficient Recommendation

Yongji Wu; Defu Lian; Neil Zhenqiang Gong; Lu Yin; Mingyang Yin; Jingren Zhou; Hongxia Yang

arXiv:2105.14068·cs.IR·June 1, 2021

TL;DR

LISA is a linear-time self-attention mechanism for sequence modeling that attends over the full context while reducing compute time and memory use, making recommendation over long sequences practical.

Contribution

The paper proposes LISA, a novel linear-time self-attention method that combines the effectiveness of vanilla attention with the efficiency of sparse attention, without restrictions on sequence length.

Findings

01

LISA outperforms state-of-the-art efficient attention methods in accuracy.

02

LISA is up to 57x faster than vanilla self-attention.

03

LISA uses up to 78x less memory than vanilla self-attention.

Abstract

Self-attention has become increasingly popular in a variety of sequence modeling tasks from natural language processing to recommendation, due to its effectiveness. However, self-attention suffers from quadratic computational and memory complexities, prohibiting its applications on long sequences. Existing approaches that address this issue mainly rely on a sparse attention context, either using a local window, or a permuted bucket obtained by locality-sensitive hashing (LSH) or sorting, while crucial information may be lost. Inspired by the idea of vector quantization that uses cluster centroids to approximate items, we propose LISA (LInear-time Self Attention), which enjoys both the effectiveness of vanilla self-attention and the efficiency of sparse attention. LISA scales linearly with the sequence length, while enabling full contextual attention via computing differentiable…
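The abstract is cut off above, so the following is only a rough illustration of the general idea it names (approximating items by codebook centroids and attending over a histogram of codewords), not the authors' implementation. It assumes a single fixed codebook, hard nearest-centroid assignment, causal attention, and codewords serving as both keys and values; the function names `nearest_codeword`, `histogram_attention`, and `quadratic_reference` are hypothetical. Under those assumptions, attention over the first i quantized items depends only on how many times each codeword has appeared so far, so a running count per codeword replaces the pass over all earlier positions.

```python
import math


def nearest_codeword(x, codebook):
    """Index of the codeword closest to x in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(x, codebook[k])),
    )


def histogram_attention(queries, items, codebook):
    """Causal attention over quantized items using a running codeword histogram.

    Cost is O(n * K * d) for n positions, K codewords, and dimension d,
    so it grows linearly with sequence length. Illustrative assumption:
    each item is replaced by its nearest codeword, used as key and value.
    """
    num_codes, dim = len(codebook), len(codebook[0])
    counts = [0] * num_codes  # histogram of codewords seen in the prefix
    outputs = []
    for q, x in zip(queries, items):
        counts[nearest_codeword(x, codebook)] += 1
        scores = [
            sum(a * b for a, b in zip(q, c)) / math.sqrt(dim) for c in codebook
        ]
        # Subtract the max score among codewords present, for stability.
        m = max(s for s, n in zip(scores, counts) if n > 0)
        weights = [
            n * math.exp(s - m) if n > 0 else 0.0
            for s, n in zip(scores, counts)
        ]
        z = sum(weights)
        outputs.append(
            [sum(w * c[j] for w, c in zip(weights, codebook)) / z
             for j in range(dim)]
        )
    return outputs


def quadratic_reference(queries, items, codebook):
    """Same quantized causal attention, computed position by position in O(n^2)."""
    dim = len(codebook[0])
    quantized = [codebook[nearest_codeword(x, codebook)] for x in items]
    outputs = []
    for i, q in enumerate(queries):
        prefix = quantized[: i + 1]
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in prefix
        ]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, prefix)) / z
             for j in range(dim)]
        )
    return outputs
```

Both functions return the same outputs up to floating-point error, which is the point of the sketch: once items are quantized, the histogram carries all the information the softmax needs. How LISA learns its codebooks and keeps the assignment differentiable is not covered by the truncated abstract and is not modeled here.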

Code & Models

Repositories

libertyeagle/LISA (PyTorch, official)