Fast Feedforward Networks

Peter Belcak; Roger Wattenhofer

arXiv:2308.14711 · cs.LG · September 19, 2023

TL;DR

The paper introduces fast feedforward (FFF) networks, a log-time alternative to feedforward networks. FFFs are up to 220x faster at inference than feedforward networks and up to 6x faster than mixture-of-experts networks, and in vision transformers they preserve 94.2% of predictive performance while using as little as 1% of layer neurons for inference.

Contribution

The paper presents FFF, an architecture that breaks the linear link between a layer's size and its inference cost: inference takes time logarithmic in the layer size rather than linear. The paper also demonstrates the architecture inside vision transformers.
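To make the size-versus-cost claim concrete, the sketch below counts neurons rather than measuring time. It assumes a layer organised as a balanced binary tree with one decision neuron per internal node and a small block of `leaf_width` neurons at each leaf. Those structural details and all function names are illustrative assumptions, not taken from this page.

```python
def tree_total_neurons(depth: int, leaf_width: int) -> int:
    """Neurons stored in the whole tree (assumed layout)."""
    internal = 2 ** depth - 1            # one decision neuron per internal node
    leaves = (2 ** depth) * leaf_width   # every leaf holds leaf_width neurons
    return internal + leaves


def tree_neurons_evaluated(depth: int, leaf_width: int) -> int:
    """Neurons touched for one input: one root-to-leaf path plus one leaf."""
    return depth + leaf_width


def dense_neurons_evaluated(width: int) -> int:
    """A dense feedforward layer evaluates every neuron for every input."""
    return width


if __name__ == "__main__":
    for depth in (2, 6, 10):
        total = tree_total_neurons(depth, leaf_width=8)
        used = tree_neurons_evaluated(depth, leaf_width=8)
        print(f"depth={depth}: stored={total}, evaluated={used}, "
              f"dense layer of equal size evaluates {dense_neurons_evaluated(total)}")
```

Under these assumptions the stored size grows exponentially with depth while the per-input work grows linearly with depth, which is the logarithmic relationship the contribution describes.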

Findings

01. FFFs are up to 220x faster than feedforward networks.

02. FFFs are up to 6x faster than mixture-of-experts networks.

03. In vision transformers, FFFs can use as little as 1% of layer neurons for inference while preserving 94.2% of predictive performance.

Abstract

We break the linear link between the layer size and its inference cost by introducing the fast feedforward (FFF) architecture, a log-time alternative to feedforward networks. We demonstrate that FFFs are up to 220x faster than feedforward networks, up to 6x faster than mixture-of-experts networks, and exhibit better training properties than mixtures of experts thanks to noiseless conditional execution. Pushing FFFs to the limit, we show that they can use as little as 1% of layer neurons for inference in vision transformers while preserving 94.2% of predictive performance.
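The abstract does not spell out the mechanism, so the following is only a minimal sketch of one way conditional execution can give log-time inference: a balanced binary tree whose internal nodes each make a hard left/right decision from a single neuron, with a small ReLU block at each leaf. The class name `TreeLayer`, the heap-style node indexing, the random initialisation, and the leaf structure are all assumptions made for illustration. Training is not shown.

```python
import random


def dot(weights, inputs):
    """Plain dot product of two equal-length sequences."""
    return sum(w * x for w, x in zip(weights, inputs))


class TreeLayer:
    """Hypothetical tree-routed layer; inference only, random weights."""

    def __init__(self, in_dim, depth, leaf_width, out_dim, seed=0):
        rng = random.Random(seed)

        def rand_vec(n):
            return [rng.uniform(-1.0, 1.0) for _ in range(n)]

        self.depth = depth
        n_internal = 2 ** depth - 1
        n_leaves = 2 ** depth
        # Internal nodes in heap order: node i has children 2i+1 and 2i+2.
        self.node_w = [rand_vec(in_dim) for _ in range(n_internal)]
        self.node_b = [rng.uniform(-1.0, 1.0) for _ in range(n_internal)]
        # Each leaf is a tiny one-hidden-layer ReLU network.
        self.leaf_hidden = [[rand_vec(in_dim) for _ in range(leaf_width)]
                            for _ in range(n_leaves)]
        self.leaf_out = [[rand_vec(leaf_width) for _ in range(out_dim)]
                         for _ in range(n_leaves)]

    def route(self, x):
        """Follow hard decisions from the root; returns a leaf index."""
        node = 0
        for _ in range(self.depth):
            # Non-negative pre-activation means "go right" (assumed rule).
            go_right = dot(self.node_w[node], x) + self.node_b[node] >= 0.0
            node = 2 * node + (2 if go_right else 1)
        return node - (2 ** self.depth - 1)

    def forward(self, x):
        """Evaluate only the selected leaf: depth + leaf_width neurons."""
        leaf = self.route(x)
        hidden = [max(0.0, dot(w, x)) for w in self.leaf_hidden[leaf]]
        return [dot(w, hidden) for w in self.leaf_out[leaf]]


if __name__ == "__main__":
    layer = TreeLayer(in_dim=4, depth=3, leaf_width=2, out_dim=5)
    sample = [0.1, -0.2, 0.3, 0.4]
    print("leaf chosen:", layer.route(sample))
    print("output length:", len(layer.forward(sample)))
```

With `depth=3` the layer stores 8 leaves but a single input only evaluates 3 decision neurons and one leaf. The routing here is deterministic for a given input, which is one reading of the "noiseless" conditional execution the abstract mentions.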

Taxonomy

Topics: Neural Networks and Applications · Brain Tumor Detection and Classification · CCD and CMOS Imaging Sensors

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Layer Normalization · Residual Connection · Vision Transformer · Fast Feedforward Networks