Diverse Branch Block: Building a Convolution as an Inception-like Unit
Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding

TL;DR
The paper introduces the Diverse Branch Block (DBB), a universal convolutional building block that improves ConvNet performance by combining diverse branches during training; after training, the block is converted into a single conv layer, so it adds no inference-time cost.
Contribution
It presents a ConvNet building block that raises the accuracy of the trained model at no extra inference-time cost. The block is a drop-in replacement for regular conv layers in any architecture and is deployed as a single conv layer.
Findings
Up to 1.9% higher top-1 accuracy on ImageNet
Improves object detection and semantic segmentation results
Enables training with richer microstructure without inference overhead
Abstract
We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs. The block is named Diverse Branch Block (DBB), which enhances the representational capacity of a single convolution by combining diverse branches of different scales and complexities to enrich the feature space, including sequences of convolutions, multi-scale convolutions, and average pooling. After training, a DBB can be equivalently converted into a single conv layer for deployment. Unlike the advancements of novel ConvNet architectures, DBB complicates the training-time microstructure while maintaining the macro architecture, so that it can be used as a drop-in replacement for regular conv layers of any architecture. In this way, the model can be trained to reach a higher level of performance and then transformed into the original…
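The conversion step described in the abstract rests on the linearity of convolution: parallel linear branches whose outputs are summed can be folded into one kernel. The following is a minimal NumPy sketch of that idea, not the authors' implementation. It assumes three branches (a 3x3 conv, a 1x1 conv, and a 3x3 stride-1 average pooling with zero padding), equal input and output channel counts, and no batch normalization or bias. The function names `conv2d`, `avg_pool`, and `merge_branches` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def conv2d(x, w, pad):
    """Naive stride-1 convolution (cross-correlation).
    x: (C_in, H, W); w: (C_out, C_in, K, K)."""
    c_out, c_in, k, _ = w.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h_out = xp.shape[1] - k + 1
    w_out = xp.shape[2] - k + 1
    out = np.zeros((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = xp[:, i:i + k, j:j + k]  # (C_in, K, K)
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def avg_pool(x, k, pad):
    """Stride-1 average pooling; zero-padded positions count toward the mean."""
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h_out = xp.shape[1] - k + 1
    w_out = xp.shape[2] - k + 1
    out = np.zeros((x.shape[0], h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            out[:, i, j] = xp[:, i:i + k, j:j + k].mean(axis=(1, 2))
    return out

def merge_branches(w_kxk, w_1x1):
    """Fold a KxK conv, a 1x1 conv, and a KxK average pooling into one KxK kernel.
    Assumes C_out == C_in so the pooling branch maps channel c to channel c."""
    c_out, c_in, k, _ = w_kxk.shape
    assert c_out == c_in, "the pooling branch needs matching channel counts"
    merged = w_kxk.copy()
    # A 1x1 kernel equals a KxK kernel that is zero everywhere except the center.
    merged[:, :, k // 2, k // 2] += w_1x1[:, :, 0, 0]
    # Average pooling equals a per-channel KxK kernel filled with 1 / K^2.
    for c in range(c_out):
        merged[c, c] += 1.0 / (k * k)
    return merged

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 6))
w3 = rng.standard_normal((4, 4, 3, 3))
w1 = rng.standard_normal((4, 4, 1, 1))

# Training-time view: three branches evaluated separately and summed.
branch_sum = conv2d(x, w3, pad=1) + conv2d(x, w1, pad=0) + avg_pool(x, 3, pad=1)
# Inference-time view: a single conv with the merged kernel.
single = conv2d(x, merge_branches(w3, w1), pad=1)
max_diff = np.abs(branch_sum - single).max()
```

In this sketch `max_diff` stays at floating-point rounding level, which is the sense in which the multi-branch block and the single conv layer are equivalent. Sequences of convolutions and batch normalization, which the abstract also mentions, need additional folding steps that are not shown here.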
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Convolution
