LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello

TL;DR
LambdaNetworks introduce lambda layers as an efficient alternative to self-attention, capturing long-range interactions in structured data like images, leading to faster and more accurate models.
Contribution
The paper proposes lambda layers that model content and position interactions without attention maps, enabling scalable and efficient neural architectures for vision tasks.
Findings
LambdaNetworks outperform convolutional and attention models on ImageNet and COCO.
LambdaResNets reach excellent ImageNet accuracy while being 3.2-4.4x faster than EfficientNets on modern machine learning accelerators.
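With large-scale semi-supervised training on an additional 130M pseudo-labeled images, LambdaResNets achieve up to a 9.5x speed-up over the corresponding EfficientNet checkpoints.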
Abstract
We present lambda layers -- an alternative framework to self-attention -- for capturing long-range interactions between an input and structured contextual information (e.g. a pixel surrounded by other pixels). Lambda layers capture such interactions by transforming available contexts into linear functions, termed lambdas, and applying these linear functions to each input separately. Similar to linear attention, lambda layers bypass expensive attention maps, but in contrast, they model both content and position-based interactions which enables their application to large structured inputs such as images. The resulting neural network architectures, LambdaNetworks, significantly outperform their convolutional and attentional counterparts on ImageNet classification, COCO object detection and COCO instance segmentation, while being more computationally efficient. Additionally, we design…
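The mechanism described in the abstract can be sketched in a few lines of NumPy. This is an illustrative single-example sketch under stated assumptions, not the paper's implementation: the names `lambda_layer`, `Wq`, `Wk`, `Wv`, and `E` are chosen for this example, the weights and position embeddings are random placeholders, and any normalization of queries and values is left out for brevity. It shows the context being summarized into one shared content lambda plus one position lambda per query, each a small linear map that is then applied to that query.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def lambda_layer(X, C, Wq, Wk, Wv, E):
    """Hypothetical single-example lambda layer.

    X: (n, d) inputs, C: (m, d) context,
    Wq, Wk: (d, k) projections, Wv: (d, v) projection,
    E: (n, m, k) position embeddings (assumed given).
    """
    Q = X @ Wq                   # (n, k) queries
    K = softmax(C @ Wk, axis=0)  # (m, k) keys, normalized over context positions
    V = C @ Wv                   # (m, v) values

    # Content lambda: one (k, v) linear map shared by every query.
    content_lambda = K.T @ V
    # Position lambdas: one (k, v) linear map per query position.
    position_lambdas = np.einsum("nmk,mv->nkv", E, V)

    lambdas = content_lambda[None] + position_lambdas  # (n, k, v)
    # Apply each lambda to its own query; no (n, m) attention map is built.
    return np.einsum("nk,nkv->nv", Q, lambdas)         # (n, v)

# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
n, m, d, k, v = 4, 6, 8, 3, 5
X = rng.normal(size=(n, d))
C = rng.normal(size=(m, d))
Wq = rng.normal(size=(d, k))
Wk = rng.normal(size=(d, k))
Wv = rng.normal(size=(d, v))
E = rng.normal(size=(n, m, k))

Y = lambda_layer(X, C, Wq, Wk, Wv, E)
print(Y.shape)  # (4, 5)
```

In this sketch the context is compressed into k-by-v maps before it meets the queries, so the intermediate memory scales with the lambda size rather than with the number of query-context pairs, which is the property the abstract contrasts with attention maps.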
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Residual Connection · Max Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block · Kaiming Initialization · Lambda Layer · Batch Normalization · 1x1 Convolution
