MixConv: Mixed Depthwise Convolutional Kernels
Mingxing Tan, Quoc V. Le

TL;DR
This paper introduces MixConv, a mixed depthwise convolution that combines multiple kernel sizes in a single layer. As a drop-in replacement for vanilla depthwise convolution, it improves the accuracy and efficiency of existing MobileNets, and it underlies the MixNet model family, which outperforms previous mobile models on ImageNet.
Contribution
The paper proposes MixConv, a depthwise convolution that mixes multiple kernel sizes within a single layer, improving accuracy and efficiency over the vanilla depthwise convolution used in existing mobile network architectures.
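To make the idea of mixing kernel sizes within one layer concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the channels are partitioned into groups, each group is filtered by a depthwise convolution with its own kernel size, and the results are concatenated. The helper names (`split_channels`, `depthwise_conv2d`, `mixconv`), the near-equal split with the remainder assigned to the first group, and the kernel sizes 3, 5, 7 are illustrative choices.

```python
import numpy as np

def split_channels(num_channels, num_groups):
    """Split channels into near-equal groups (assumption: remainder goes to the first group)."""
    sizes = [num_channels // num_groups] * num_groups
    sizes[0] += num_channels - sum(sizes)
    return sizes

def depthwise_conv2d(x, kernels):
    """Depthwise cross-correlation with zero 'same' padding and stride 1.

    x: array of shape (C, H, W); kernels: array of shape (C, k, k) with odd k.
    Each channel is filtered only by its own k x k kernel.
    """
    c, h, w = x.shape
    k = kernels.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x, dtype=float)
    for i in range(k):
        for j in range(k):
            # Accumulate one kernel tap at a time across all channels.
            out += kernels[:, i, j][:, None, None] * xp[:, i:i + h, j:j + w]
    return out

def mixconv(x, kernels_per_group):
    """Apply a different kernel size to each channel group, then concatenate."""
    sizes = [kg.shape[0] for kg in kernels_per_group]
    assert sum(sizes) == x.shape[0], "group sizes must cover all channels"
    outputs, start = [], 0
    for kg, size in zip(kernels_per_group, sizes):
        outputs.append(depthwise_conv2d(x[start:start + size], kg))
        start += size
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))          # 8 channels, 16x16 feature map
kernel_sizes = [3, 5, 7]                      # illustrative kernel sizes
group_sizes = split_channels(x.shape[0], len(kernel_sizes))
weights = [rng.standard_normal((g, k, k)) for g, k in zip(group_sizes, kernel_sizes)]
y = mixconv(x, weights)
print(y.shape)
```

Because the output has the same shape as the input, a layer like this can stand in wherever an ordinary depthwise convolution is used, which is what makes it a drop-in replacement.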
Findings
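MixConv improves the accuracy and efficiency of existing MobileNets on both ImageNet classification and COCO object detection.
MixNets, found by adding MixConv to an AutoML search space, outperform previous mobile models such as MobileNetV2 on ImageNet.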
MixNet-L achieves 78.9% top-1 accuracy on ImageNet.
Abstract
Depthwise convolution is becoming increasingly popular in modern efficient ConvNets, but its kernel size is often overlooked. In this paper, we systematically study the impact of different kernel sizes, and observe that combining the benefits of multiple kernel sizes can lead to better accuracy and efficiency. Based on this observation, we propose a new mixed depthwise convolution (MixConv), which naturally mixes up multiple kernel sizes in a single convolution. As a simple drop-in replacement of vanilla depthwise convolution, our MixConv improves the accuracy and efficiency for existing MobileNets on both ImageNet classification and COCO object detection. To demonstrate the effectiveness of MixConv, we integrate it into AutoML search space and develop a new family of models, named as MixNets, which outperform previous mobile models including MobileNetV2 [20] (ImageNet top-1 accuracy…
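As a rough illustration of the efficiency side of the argument, the sketch below counts depthwise weights for a single large kernel versus a mix of kernel sizes. The layer width (32 channels), the equal four-way split, and the kernel sizes are hypothetical values chosen for the arithmetic and are not taken from the paper. Biases and the surrounding pointwise convolutions are ignored.

```python
def depthwise_params(channels, kernel_size):
    """Weights in a depthwise convolution: one k x k kernel per channel."""
    return channels * kernel_size ** 2

def mixconv_params(group_sizes, kernel_sizes):
    """Weights in a mixed depthwise convolution: sum over channel groups."""
    return sum(c * k ** 2 for c, k in zip(group_sizes, kernel_sizes))

channels = 32                                            # hypothetical layer width
single_large = depthwise_params(channels, 9)             # 32 * 81
mixed = mixconv_params([8, 8, 8, 8], [3, 5, 7, 9])       # 8 * (9 + 25 + 49 + 81)
print(single_large, mixed)  # prints: 2592 1312
```

Under these assumptions the mixed layer still includes 9x9 kernels but uses about half the depthwise weights of an all-9x9 layer, since most channels use smaller kernels.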
Code & Models
- 🤗 timm/mixnet_l.ft_in1k (model, 1.9k downloads)
- 🤗 timm/mixnet_m.ft_in1k (model, 291 downloads)
- 🤗 timm/mixnet_s.ft_in1k (model, 444 downloads)
- 🤗 timm/mixnet_xl.ra_in1k (model, 106 downloads)
- 🤗 timm/tf_mixnet_l.in1k (model, 1.2k downloads)
- 🤗 timm/tf_mixnet_m.in1k (model, 65 downloads)
- 🤗 timm/tf_mixnet_s.in1k (model, 89 downloads)
- 🤗 kadirnar/timm_model_list (model, 1 like)
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Residual Connection · Residual Block · Grouped Convolution · Sigmoid Activation · Dropout · Pointwise Convolution · Depthwise Separable Convolution · MobileNetV1 · Squeeze-and-Excitation Block
