SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution

Shwai He; Chenbo Jiang; Daize Dong; Liang Ding

arXiv:2204.02227·cs.CV·May 29, 2023

TL;DR

SD-Conv combines dynamic convolution with mask-based unstructured pruning, reducing parameters and FLOPs while maintaining high performance across tasks.

Contribution

The paper introduces SD-Conv, a framework that integrates dynamic convolution with sparsity by pruning static kernels with a binary mask derived from a learnable threshold, improving both efficiency and performance.

Findings

01. Reduces parameters significantly while maintaining or improving accuracy.

02. Achieves higher performance on ImageNet-1K with fewer parameters.

03. Transfers pretrained models to downstream tasks with consistent improvements.

Abstract

Dynamic convolution achieves better performance for efficient CNNs at the cost of a negligible increase in FLOPs. However, the performance gain does not match the significantly expanded number of parameters, which is the main bottleneck in real-world applications. In contrast, mask-based unstructured pruning obtains a lightweight network by removing redundancy in a heavy network. In this paper, we propose a new framework, Sparse Dynamic Convolution (SD-Conv), to naturally integrate these two paths so that it inherits the advantages of both the dynamic mechanism and sparsity. We first design a binary mask derived from a learnable threshold to prune static kernels, significantly reducing the parameters and computational cost while achieving higher performance on ImageNet-1K. We further transfer pretrained models into a variety of downstream tasks, showing consistently better…
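
The idea in the abstract, pruning each static kernel with a binary mask and then mixing the kernels with input-dependent attention, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the magnitude-based mask rule, the fixed scalar `threshold` (learnable in the paper), the single linear attention layer `attn_weight`, and all function names are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def binary_mask(kernels, threshold):
    """Assumed mask rule: 1 where |weight| exceeds the threshold, else 0."""
    return (np.abs(kernels) > threshold).astype(kernels.dtype)

def sparse_dynamic_kernel(x, kernels, attn_weight, threshold):
    """Aggregate K masked static kernels into one input-dependent kernel.

    x:           (C_in, H, W) input feature map
    kernels:     (K, C_out, C_in, kh, kw) static kernels
    attn_weight: (K, C_in) hypothetical linear attention layer
    """
    pooled = x.mean(axis=(1, 2))              # global average pooling -> (C_in,)
    pi = softmax(attn_weight @ pooled)        # attention over the K kernels
    masked = kernels * binary_mask(kernels, threshold)  # prune static kernels
    agg = np.tensordot(pi, masked, axes=1)    # weighted sum -> (C_out, C_in, kh, kw)
    return agg, pi

rng = np.random.default_rng(0)
K, C_out, C_in, k = 4, 8, 3, 3
kernels = rng.normal(size=(K, C_out, C_in, k, k))
attn_weight = rng.normal(size=(K, C_in))
x = rng.normal(size=(C_in, 16, 16))

agg, pi = sparse_dynamic_kernel(x, kernels, attn_weight, threshold=0.5)
sparsity = 1.0 - binary_mask(kernels, 0.5).mean()  # fraction of pruned weights
```

The aggregated kernel `agg` would then be used in an ordinary convolution over `x`, which is omitted here. Only the nonzero entries of the masked static kernels need to be stored, which is where the parameter saving would come from under these assumptions.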

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Pruning · Pointwise Convolution · Depthwise Convolution · Average Pooling · Depthwise Separable Convolution · Softmax · Global Average Pooling · Dense Connections · Batch Normalization