Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

Yunpeng Chen; Haoqi Fan; Bing Xu; Zhicheng Yan; Yannis Kalantidis; Marcus Rohrbach; Shuicheng Yan; Jiashi Feng

arXiv:1904.05049·cs.CV·August 20, 2019·151 cites

TL;DR

This paper introduces Octave Convolution, a novel method that factorizes feature maps by frequency to reduce spatial redundancy, lowering memory and computation costs while improving accuracy in image and video recognition tasks.

Contribution

The paper proposes Octave Convolution, a plug-and-play operation that separates feature maps by frequency, enhancing efficiency and accuracy without altering existing network architectures.

Findings

01. Boosts accuracy in image and video recognition tasks.

02. Reduces memory and computational costs.

03. Achieves 82.9% top-1 accuracy on ImageNet with only 22.2 GFLOPs (OctConv-equipped ResNet-152).

Abstract

In natural images, information is conveyed at different frequencies where higher frequencies are usually encoded with fine details and lower frequencies are usually encoded with global structures. Similarly, the output feature maps of a convolution layer can also be seen as a mixture of information at different frequencies. In this work, we propose to factorize the mixed feature maps by their frequencies, and design a novel Octave Convolution (OctConv) operation to store and process feature maps that vary spatially "slower" at a lower spatial resolution, reducing both memory and computation cost. Unlike existing multi-scale methods, OctConv is formulated as a single, generic, plug-and-play convolutional unit that can be used as a direct replacement of (vanilla) convolutions without any adjustments in the network architecture. It is also orthogonal and complementary to methods that…
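To give a rough sense of why storing the slowly varying feature maps at a lower resolution saves memory and computation, here is a minimal back-of-the-envelope sketch. It is not the authors' implementation. It assumes that a fraction `alpha` of the channels forms the low-frequency group, that this group is stored one octave lower (half the height and half the width), and that every path between the two groups is convolved at the resolution of its cheaper side. The names `alpha`, `memory_fraction`, and `compute_fraction` are hypothetical and chosen only for illustration.

```python
def memory_fraction(alpha: float) -> float:
    """Feature-map storage relative to a vanilla layer of the same width.

    Assumption: a fraction `alpha` of the channels is kept at half
    resolution in each spatial dimension, i.e. one quarter of the positions.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) + alpha / 4.0


def compute_fraction(alpha: float) -> float:
    """Multiply-adds relative to a vanilla convolution of the same width.

    Assumption: cost is proportional to in_channels * out_channels *
    output positions, and only the high-to-high path runs at full
    resolution; the other three paths run at quarter-size maps.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    high_to_high = (1.0 - alpha) ** 2          # full-resolution path
    cross_paths = 2.0 * alpha * (1.0 - alpha)  # high-to-low and low-to-high
    low_to_low = alpha ** 2                    # low-resolution path
    return high_to_high + (cross_paths + low_to_low) / 4.0


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5, 0.75):
        print(f"alpha={alpha:.2f}  "
              f"memory={memory_fraction(alpha):.4f}  "
              f"compute={compute_fraction(alpha):.4f}")
```

Under these assumptions, setting `alpha` to 0 recovers the vanilla cost, and larger values trade full-resolution channels for cheaper low-resolution ones. The accuracy effect of any particular split is an empirical question that this arithmetic does not address.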

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning

Methods: ResNeXt Block · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · Octave Convolution · Sigmoid Activation · Dense Connections · Squeeze-and-Excitation Block · Grouped Convolution · Residual Connection