Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
Bichen Wu, Alvin Wan, Xiangyu Yue, Peter Jin, Sicheng Zhao, Noah Golmant, Amir Gholaminejad, Joseph Gonzalez, Kurt Keutzer

TL;DR
This paper introduces a parameter-free, FLOP-free shift operation as an efficient alternative to spatial convolutions in neural networks, reducing parameters and computation while maintaining or improving accuracy across tasks.
Contribution
The authors propose a novel shift-based module that replaces traditional convolutions, enabling end-to-end training with fewer parameters and FLOPs, and demonstrate its effectiveness across multiple domains.
Findings
Replacing ResNet's 3x3 convolutions with shift-based modules improves CIFAR10 and CIFAR100 accuracy while using 60% fewer parameters.
Outperforms traditional ResNet models on ImageNet with fewer parameters.
Achieves strong performance in classification, face verification, and style transfer tasks.
Abstract
Neural networks rely on convolutions to aggregate spatial information. However, spatial convolutions are expensive in terms of model size and computation, both of which grow quadratically with respect to kernel size. In this paper, we present a parameter-free, FLOP-free "shift" operation as an alternative to spatial convolutions. We fuse shifts and point-wise convolutions to construct end-to-end trainable shift-based modules, with a hyperparameter characterizing the tradeoff between accuracy and efficiency. To demonstrate the operation's efficacy, we replace ResNet's 3x3 convolutions with shift-based modules for improved CIFAR10 and CIFAR100 accuracy using 60% fewer parameters; we additionally demonstrate the operation's resilience to parameter reduction on ImageNet, outperforming ResNet family members. We finally show the shift operation's applicability across domains, achieving strong…
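The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`shift`, `pointwise_conv`, `conv_params`), the even split of channels across shift directions, and the zero-filled borders are all assumptions made for illustration. The shift itself only moves data, so it has no weights and performs no multiply-adds; the point-wise (1x1) convolution that follows is where channels get mixed.

```python
import numpy as np


def shift(x, kernel_size=3):
    """Shift groups of channels of x (shape C x H x W) in different directions.

    Illustrative assumptions: channels are split as evenly as possible across
    the kernel_size**2 possible offsets, and vacated borders are zero-filled.
    The operation only copies data: no parameters, no multiply-adds.
    """
    c, h, w = x.shape
    r = kernel_size // 2
    offsets = [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
    groups = np.array_split(np.arange(c), len(offsets))
    out = np.zeros_like(x)
    for (dy, dx), idx in zip(offsets, groups):
        if len(idx) == 0:
            continue
        # Destination and source windows for a shift by (dy, dx).
        dst_y = slice(max(dy, 0), h + min(dy, 0))
        src_y = slice(max(-dy, 0), h + min(-dy, 0))
        dst_x = slice(max(dx, 0), w + min(dx, 0))
        src_x = slice(max(-dx, 0), w + min(-dx, 0))
        out[idx, dst_y, dst_x] = x[idx, src_y, src_x]
    return out


def pointwise_conv(x, weight):
    """1x1 convolution: mix channels at every pixel. weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def conv_params(c_in, c_out, k):
    """Weight count of a dense k x k convolution (bias ignored)."""
    return k * k * c_in * c_out
```

Under these assumptions, `pointwise_conv(shift(x), weight)` gathers information from neighboring pixels using only `conv_params(c_in, c_out, 1)` weights, whereas a dense 3x3 convolution over the same channels needs `conv_params(c_in, c_out, 3)`, nine times as many. That factor of `k * k` is the quadratic growth in kernel size the abstract refers to.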
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Face recognition and analysis
Methods: Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling · Residual Connection
