Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution
Yunpeng Chen, Haoqi Fan, Bing Xu, Zhicheng Yan, Yannis Kalantidis, Marcus Rohrbach, Shuicheng Yan, Jiashi Feng

TL;DR
This paper introduces Octave Convolution, a novel method that factorizes feature maps by frequency to reduce spatial redundancy, lowering memory and computation costs while improving accuracy in image and video recognition tasks.
Contribution
The paper proposes Octave Convolution, a plug-and-play operation that separates feature maps by frequency, enhancing efficiency and accuracy without altering existing network architectures.
Findings
Boosts accuracy in image and video recognition tasks.
Reduces memory and computational costs.
Achieves 82.9% top-1 accuracy on ImageNet with only 22.2 GFLOPs (OctConv-equipped ResNet-152).
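The memory and compute savings listed above can be estimated with a back-of-the-envelope calculation. The sketch below rests on assumptions that this summary does not state: a fraction `alpha` of the channels is treated as low-frequency and stored at half the height and half the width, the same `alpha` applies to a layer's input and output, and every path that touches the low-frequency group convolves at the reduced resolution. The function name `octconv_cost_ratios` is hypothetical.

```python
def octconv_cost_ratios(alpha):
    """Estimate memory and FLOPs of one layer relative to a vanilla convolution.

    Assumption: `alpha` is the share of channels kept at half resolution,
    so each of those channels holds 1/4 of the spatial positions.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Feature-map storage: full-size channels plus quarter-size channels.
    memory = (1 - alpha) + alpha / 4

    # Four information paths; cost ~ (input share) * (output share) * positions.
    high_to_high = (1 - alpha) * (1 - alpha)      # full resolution
    high_to_low = (1 - alpha) * alpha / 4         # pooled first, then convolved
    low_to_high = alpha * (1 - alpha) / 4         # convolved, then upsampled
    low_to_low = alpha * alpha / 4                # stays at low resolution
    flops = high_to_high + high_to_low + low_to_high + low_to_low

    return memory, flops


print(octconv_cost_ratios(0.5))  # (0.625, 0.4375)
```

Under these assumptions the FLOPs ratio simplifies to 1 - (3/4)·alpha·(2 - alpha), so the savings grow with the share of channels moved to the low-frequency group. Pooling, upsampling, and additions are ignored here, so the numbers are an estimate rather than a measurement.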
Abstract
In natural images, information is conveyed at different frequencies where higher frequencies are usually encoded with fine details and lower frequencies are usually encoded with global structures. Similarly, the output feature maps of a convolution layer can also be seen as a mixture of information at different frequencies. In this work, we propose to factorize the mixed feature maps by their frequencies, and design a novel Octave Convolution (OctConv) operation to store and process feature maps that vary spatially "slower" at a lower spatial resolution, reducing both memory and computation cost. Unlike existing multi-scale methods, OctConv is formulated as a single, generic, plug-and-play convolutional unit that can be used as a direct replacement of (vanilla) convolutions without any adjustments in the network architecture. It is also orthogonal and complementary to methods that…
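To make the idea of two feature-map groups at different resolutions concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions, not the authors' implementation: kernels are 1x1 (pointwise), the low-frequency group is stored at half the height and width, the high-to-low path uses 2x2 average pooling, and the low-to-high path uses nearest-neighbour upsampling. All function and weight names (`oct_conv`, `w_hh`, `w_hl`, `w_lh`, `w_ll`) are hypothetical.

```python
def pointwise_conv(x, w):
    """1x1 convolution. x is [C_in][H][W]; w is [C_out][C_in]."""
    rows, cols = len(x[0]), len(x[0][0])
    return [[[sum(w[o][i] * x[i][r][c] for i in range(len(x)))
              for c in range(cols)] for r in range(rows)]
            for o in range(len(w))]


def avg_pool2(x):
    """2x2 average pooling with stride 2 on [C][H][W] (H and W even)."""
    return [[[(ch[2 * r][2 * c] + ch[2 * r][2 * c + 1]
               + ch[2 * r + 1][2 * c] + ch[2 * r + 1][2 * c + 1]) / 4.0
              for c in range(len(ch[0]) // 2)]
             for r in range(len(ch) // 2)] for ch in x]


def upsample2(x):
    """Nearest-neighbour upsampling by a factor of 2 on [C][H][W]."""
    return [[[ch[r // 2][c // 2] for c in range(2 * len(ch[0]))]
             for r in range(2 * len(ch))] for ch in x]


def add(a, b):
    """Element-wise sum of two tensors with identical [C][H][W] shape."""
    return [[[a[k][r][c] + b[k][r][c] for c in range(len(a[k][r]))]
             for r in range(len(a[k]))] for k in range(len(a))]


def oct_conv(x_h, x_l, w_hh, w_hl, w_lh, w_ll):
    """Assumed two-branch update: each output group mixes both input groups."""
    # High-frequency output: high->high plus upsampled low->high.
    y_h = add(pointwise_conv(x_h, w_hh), upsample2(pointwise_conv(x_l, w_lh)))
    # Low-frequency output: low->low plus high->low on the pooled input.
    y_l = add(pointwise_conv(x_l, w_ll), pointwise_conv(avg_pool2(x_h), w_hl))
    return y_h, y_l
```

Because the function takes and returns a (high, low) pair, layers of this kind can be chained, and the low-frequency branch always operates on a quarter of the spatial positions. A practical version would use k x k kernels and a tensor library rather than nested lists.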
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning
Methods: ResNeXt Block · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · Octave Convolution · Sigmoid Activation · Dense Connections · Squeeze-and-Excitation Block · Grouped Convolution · Residual Connection
