Dispatcher: A Message-Passing Approach To Language Modelling

Alberto Cetoli

arXiv:2105.03994·cs.CL·August 2, 2021

TL;DR

This paper introduces the Dispatcher layer, a message-passing layer that replaces self-attention in language models. It reaches perplexity competitive with prior results at lower computational and memory cost.

Contribution

The paper presents a new message-passing layer that substitutes for self-attention in unidirectional sequence generation, gaining efficiency while keeping perplexity comparable to prior results.

Findings

01. Achieves O(N log N) computational complexity

02. Maintains comparable perplexity to existing methods

03. Uses O(N) memory complexity
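
To give a feel for the gap between the O(N log N) figure above and the O(N^2) pairwise cost of full self-attention, the following sketch tabulates both counts for a few sequence lengths. The function names are hypothetical and constant factors are ignored, so the numbers illustrate growth rates only, not measured runtimes of the Dispatcher layer.

```python
import math


def attention_pair_count(n: int) -> int:
    """Token-pair interactions in full self-attention: n * n."""
    return n * n


def nlogn_count(n: int) -> int:
    """Illustrative n * log2(n) count; constant factors are ignored."""
    return int(n * math.log2(n))


for n in (512, 1024, 4096):
    quadratic = attention_pair_count(n)
    log_linear = nlogn_count(n)
    # The ratio shows how the gap widens as the sequence grows.
    print(f"N={n}: N^2={quadratic}, N*log2(N)={log_linear}, "
          f"ratio={quadratic / log_linear:.1f}")
```

At N = 1024 the quadratic count is 1,048,576 against 10,240 for N log2 N, roughly a hundredfold difference before constants are taken into account.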

Abstract

This paper proposes a message-passing mechanism to address language modelling. A new layer type is introduced that aims to substitute self-attention for unidirectional sequence generation tasks. The system is shown to be competitive with existing methods: given N tokens, the computational complexity is O(N log N) and the memory complexity is O(N) under reasonable assumptions. In the end, the Dispatcher layer is seen to achieve comparable perplexity to prior results while being more efficient.
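
One generic way a unidirectional scheme can reach O(N log N) work with O(N) memory is to pass messages over doubling offsets, as in the classic Hillis-Steele prefix scan. The sketch below is that textbook scan applied to a running sum. It is not taken from the paper: whether the Dispatcher layer routes messages this way is an assumption, and the function name is hypothetical.

```python
def causal_doubling_scan(values):
    """Inclusive prefix sum via doubling offsets (Hillis-Steele scan).

    Illustrative only; not the Dispatcher layer itself.
    """
    state = list(values)
    n = len(state)
    offset = 1
    rounds = 0
    while offset < n:
        # Position i receives a message only from position i - offset,
        # so information never flows from later tokens to earlier ones.
        # The comprehension reads the old state and builds a new buffer,
        # so memory stays at two length-N lists.
        state = [
            state[i] + state[i - offset] if i >= offset else state[i]
            for i in range(n)
        ]
        offset *= 2
        rounds += 1
    return state, rounds


prefix, rounds = causal_doubling_scan([1, 2, 3, 4, 5, 6, 7, 8])
print(prefix)  # [1, 3, 6, 10, 15, 21, 28, 36]
print(rounds)  # 3
```

Each round touches all N positions and there are about log2(N) rounds, which gives O(N log N) total work. After the last round every position has aggregated all earlier positions, which is the property a causal language model needs.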

Code & Models

Repositories

fractalego/dispatcher (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems