Fast-FNet: Accelerating Transformer Encoder Models via Efficient Fourier Layers

Nurullah Sevim, Ege Ozan Özyedek, Furkan Şahinuç, Aykut Koç

arXiv:2209.12816 · cs.CL · May 17, 2023

Open Access

TL;DR

Fast-FNet introduces efficient Fourier Transform-based methods to replace the attention mechanism in transformer encoders, reducing computational costs and enhancing performance for NLP tasks.

Contribution

The paper proposes novel Fourier Transform deployment strategies in transformer encoders, achieving smaller models, faster training, and improved efficiency over prior Fourier-based models.

Findings

01. Reduced training time and memory usage
02. Fewer model parameters
03. Performance improvements on benchmarks

Abstract

Transformer-based language models utilize the attention mechanism for substantial performance improvements in almost all natural language processing (NLP) tasks. Similar attention structures are also extensively studied in several other areas. Although the attention mechanism enhances model performance significantly, its quadratic complexity prevents efficient processing of long sequences. Recent works have focused on eliminating this computational inefficiency and have shown that transformer-based models can still reach competitive results without the attention layer. A pioneering study proposed FNet, which replaces the attention layer with the Fourier Transform (FT) in the transformer encoder architecture. FNet achieves competitive performance relative to the original transformer encoder model while accelerating the training process by removing the computational burden of…
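As a rough illustration of the idea the abstract describes, the sketch below contrasts a simplified self-attention mixing step with an FNet-style Fourier mixing step. It assumes the mixing rule from the original FNet study (the real part of a 2D discrete Fourier transform taken over the sequence and hidden dimensions). It does not attempt to reproduce the Fast-FNet variants, which this page does not spell out. The function names `attention_mixing` and `fourier_mixing` are hypothetical, and the attention version omits the learned query, key, and value projections for brevity.

```python
import numpy as np


def attention_mixing(x: np.ndarray) -> np.ndarray:
    """Simplified self-attention without learned projections (illustrative only).

    x has shape (seq_len, hidden_dim). The score matrix has shape
    (seq_len, seq_len), which is the source of the quadratic cost.
    """
    scores = x @ x.T / np.sqrt(x.shape[-1])
    # Numerically stable row-wise softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x


def fourier_mixing(x: np.ndarray) -> np.ndarray:
    """FNet-style token mixing: real part of a 2D DFT.

    np.fft.fft2 transforms the last two axes, here the sequence and hidden
    dimensions. The step has no learnable parameters.
    """
    return np.fft.fft2(x).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.standard_normal((8, 4))  # 8 tokens, hidden size 4
    print(attention_mixing(tokens).shape)
    print(fourier_mixing(tokens).shape)
```

Both functions return an array of the same shape as the input, so in this sketch the Fourier step can stand in for the attention step without changing the rest of an encoder block. The Fourier step avoids building the sequence-by-sequence score matrix entirely.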

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Neural Network Applications