Dynamic Convolution: Attention over Convolution Kernels

Yinpeng Chen; Xiyang Dai; Mengchen Liu; Dongdong Chen; Lu Yuan; Zicheng Liu

arXiv:1912.03458 · cs.CV · April 2, 2020 · 69 cites

TL;DR

This paper introduces Dynamic Convolution, which enhances lightweight CNNs by aggregating multiple kernels via attention, improving accuracy and representation without increasing network depth or width.

Contribution

The paper proposes Dynamic Convolution, a novel method that increases model capacity through input-dependent kernel aggregation, boosting performance efficiently in lightweight CNNs.
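Input-dependent kernel aggregation can be written compactly. The block below is a sketch in notation assumed here for illustration (none of these symbols appear on this page): $x$ is the layer input, $g$ an activation function, $\tilde{W}_k$ and $\tilde{b}_k$ the weights and bias of the $k$-th of $K$ parallel kernels, and $\pi_k(x)$ the attention weight assigned to that kernel.

```latex
\begin{aligned}
y &= g\big(\tilde{W}(x)^{\top} x + \tilde{b}(x)\big), \\
\tilde{W}(x) &= \sum_{k=1}^{K} \pi_k(x)\, \tilde{W}_k, \qquad
\tilde{b}(x) = \sum_{k=1}^{K} \pi_k(x)\, \tilde{b}_k, \\
\text{s.t.}\quad & 0 \le \pi_k(x) \le 1, \qquad \sum_{k=1}^{K} \pi_k(x) = 1 .
\end{aligned}
```

Because $\pi_k(x)$ depends on the input, the aggregated layer is a non-linear function of $x$ even though each individual kernel is linear.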

Findings

01 Boosts ImageNet top-1 accuracy of MobileNetV3-Small by 2.3% with only 4% additional FLOPs.

02 Achieves a 2.9 AP improvement on COCO keypoint detection.

03 Enhances MobileNetV3-Small without increasing network depth or width.

Abstract

Light-weight convolutional neural networks (CNNs) suffer performance degradation as their low computational budgets constrain both the depth (number of convolution layers) and the width (number of channels) of CNNs, resulting in limited representation capability. To address this issue, we present Dynamic Convolution, a new design that increases model complexity without increasing the network depth or width. Instead of using a single convolution kernel per layer, dynamic convolution aggregates multiple parallel convolution kernels dynamically based upon their attentions, which are input dependent. Assembling multiple kernels is not only computationally efficient due to the small kernel size, but also has more representation power since these kernels are aggregated in a non-linear way via attention. By simply using dynamic convolution for the state-of-the-art architecture…
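The aggregation step described in the abstract can be illustrated with a minimal pure-Python sketch in one dimension. Everything named here (`dynamic_conv1d`, `attn_weights`, `attn_bias`, the toy kernels) is hypothetical, and the attention branch is deliberately simplified to global average pooling, one linear layer, and a softmax; it is not presented as the authors' implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; weights are non-negative and sum to 1."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d_valid(signal, kernel):
    """Plain 'valid' 1-D cross-correlation (no padding, stride 1)."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]


def dynamic_conv1d(signal, kernels, attn_weights, attn_bias, temperature=1.0):
    """Aggregate K kernels with input-dependent attention, then convolve once.

    Assumption: attention is global average pooling -> linear -> softmax,
    a simplification chosen for illustration only.
    """
    pooled = sum(signal) / len(signal)  # global average pool of the input
    logits = [w * pooled + b for w, b in zip(attn_weights, attn_bias)]
    attention = softmax(logits, temperature)  # one weight per kernel
    width = len(kernels[0])
    aggregated = [
        sum(a * kernel[j] for a, kernel in zip(attention, kernels))
        for j in range(width)
    ]
    return conv1d_valid(signal, aggregated), attention


# Hypothetical toy setup: an edge-like kernel and a smoothing kernel.
kernels = [[1.0, 0.0, -1.0], [0.25, 0.5, 0.25]]
signal = [1.0, 2.0, 4.0, 8.0]
output, attention = dynamic_conv1d(
    signal, kernels, attn_weights=[1.0, -1.0], attn_bias=[0.0, 0.0]
)
```

Convolution is linear in the kernel, so mixing the K kernels first and convolving once gives the same result as convolving K times and mixing the outputs. The sketch does the former, which costs one convolution plus a small attention computation.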

Code & Models

Videos

Dynamic Convolution: Attention Over Convolution Kernels · YouTube

Dynamic Convolution: Attention over Convolution Kernels · YouTube

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Convolution