SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution
Shwai He, Chenbo Jiang, Daize Dong, Liang Ding

TL;DR
SD-Conv combines dynamic convolution with mask-based unstructured pruning, cutting parameters and FLOPs while achieving higher accuracy on ImageNet-1K and consistent gains on downstream tasks.
Contribution
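The paper introduces SD-Conv (Sparse Dynamic Convolution), a framework that prunes the static kernels of dynamic convolution with a binary mask derived from a learnable threshold, so the model keeps the dynamic mechanism while gaining the parameter savings of sparsity.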
Findings
Reduces parameters significantly while maintaining or improving accuracy.
Achieves higher performance on ImageNet-1K with fewer parameters.
Transfers pretrained models to downstream tasks with consistent improvements.
Abstract
Dynamic convolution achieves better performance for efficient CNNs at the cost of a negligible increase in FLOPs. However, the performance gain does not match the significantly expanded number of parameters, which is the main bottleneck in real-world applications. In contrast, mask-based unstructured pruning obtains a lightweight network by removing redundancy from a heavy network. In this paper, we propose a new framework, Sparse Dynamic Convolution (SD-Conv), to naturally integrate these two paths so that it inherits the advantages of both the dynamic mechanism and sparsity. We first design a binary mask derived from a learnable threshold to prune static kernels, significantly reducing the parameters and computational cost while achieving higher performance on ImageNet-1K. We further transfer pretrained models into a variety of downstream tasks, showing consistently better…
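The mechanism the abstract describes (static kernels pruned by a binary mask, then mixed by input-dependent attention) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a magnitude criterion for the mask (keep a weight when its absolute value exceeds the threshold), treats the thresholds as fixed numbers rather than learned ones, and takes the attention logits as given instead of computing them from the input. All function names are hypothetical, and kernels are flattened to 1D lists for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def binary_mask(kernel, threshold):
    """1.0 keeps a weight, 0.0 prunes it (magnitude criterion: an assumption)."""
    return [1.0 if abs(w) > threshold else 0.0 for w in kernel]


def sparse_dynamic_kernel(static_kernels, thresholds, attention_logits):
    """Mask each static kernel, then mix the masked kernels with softmax attention."""
    attn = softmax(attention_logits)
    combined = [0.0] * len(static_kernels[0])
    for kernel, thr, a in zip(static_kernels, thresholds, attn):
        mask = binary_mask(kernel, thr)
        for i, (m, w) in enumerate(zip(mask, kernel)):
            combined[i] += a * m * w
    return combined


def sparsity(static_kernels, thresholds):
    """Fraction of static-kernel weights removed by the masks."""
    total = sum(len(k) for k in static_kernels)
    kept = sum(sum(binary_mask(k, t)) for k, t in zip(static_kernels, thresholds))
    return 1.0 - kept / total


def conv1d_valid(signal, kernel):
    """Plain 1D cross-correlation without padding, to apply the mixed kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Toy example: two static kernels, equal attention, one shared threshold value.
kernels = [[0.5, -0.01, 0.2], [0.02, 0.4, -0.3]]
thresholds = [0.1, 0.1]
mixed = sparse_dynamic_kernel(kernels, thresholds, attention_logits=[0.0, 0.0])
output = conv1d_valid([1.0, 2.0, 3.0, 4.0], mixed)
pruned_fraction = sparsity(kernels, thresholds)
```

In this toy setup only the masked (nonzero) weights of each static kernel would need to be stored, which is where the parameter saving comes from. A trainable version would also need a way to pass gradients through the hard step in `binary_mask`; that part is omitted here.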
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Pruning · Pointwise Convolution · Depthwise Convolution · Average Pooling · Depthwise Separable Convolution · Softmax · Global Average Pooling · Dense Connections · Batch Normalization
