No More Strided Convolutions or Pooling: A New CNN Building Block for Low-Resolution Images and Small Objects
Raja Sunkara, Tie Luo

TL;DR
This paper introduces SPD-Conv, a new CNN building block that replaces strided convolutions and pooling layers, significantly improving performance on low-resolution images and small objects in tasks like classification and detection.
Contribution
The paper proposes SPD-Conv, a CNN building block (a space-to-depth layer followed by a non-strided convolution) that replaces every strided convolution and pooling layer, preserving fine-grained features for low-resolution images and small objects.
Findings
SPD-Conv improves accuracy on low-resolution images.
Replacing strided convolution and pooling layers with SPD-Conv boosts small-object detection.
Architectures built with SPD-Conv outperform state-of-the-art models.
Abstract
Convolutional neural networks (CNNs) have achieved resounding success in many computer vision tasks such as image classification and object detection. However, their performance degrades rapidly on tougher tasks where images are of low resolution or objects are small. In this paper, we point out that this is rooted in a defective yet common design in existing CNN architectures, namely the use of strided convolution and/or pooling layers, which results in a loss of fine-grained information and the learning of less effective feature representations. To this end, we propose a new CNN building block called SPD-Conv in place of each strided convolution layer and each pooling layer (thus eliminating them altogether). SPD-Conv consists of a space-to-depth (SPD) layer followed by a non-strided convolution (Conv) layer, and can be applied in most if not all CNN architectures. We explain this new design…
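The two-step structure described in the abstract (a space-to-depth rearrangement, then a non-strided convolution) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation. The function names space_to_depth, conv1x1 and spd_conv, the channels-first (C, H, W) layout, the scale factor of 2, and the choice of a 1x1 kernel for the non-strided convolution are all assumptions made here for brevity.

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Rearrange (C, H, W) into (C * scale**2, H // scale, W // scale).

    Every input value is kept; spatial resolution is traded for channels.
    (Hypothetical helper, channels-first layout assumed.)
    """
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # Each (i, j) offset selects one sub-sampled copy of the feature map;
    # the copies are stacked along the channel axis.
    subs = [x[:, i::scale, j::scale] for i in range(scale) for j in range(scale)]
    return np.concatenate(subs, axis=0)

def conv1x1(x, weight):
    """Non-strided 1x1 convolution; weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def spd_conv(x, weight, scale=2):
    """Space-to-depth followed by a stride-1 convolution (assumed 1x1 here)."""
    return conv1x1(space_to_depth(x, scale), weight)

# Toy input: 2 channels, 4x4 spatial grid.
x = np.arange(2 * 4 * 4, dtype=np.float64).reshape(2, 4, 4)
rng = np.random.default_rng(0)
w = rng.standard_normal((3, 2 * 2 * 2))  # 3 output channels, 8 input channels
y = spd_conv(x, w)
print(space_to_depth(x).shape, y.shape)
```

Unlike a stride-2 convolution or 2x2 pooling, the rearrangement halves the spatial size without discarding any input value, so the following convolution still sees all of the original information.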
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Industrial Vision Systems and Defect Detection
Methods: Batch Normalization · 1x1 Convolution · Residual Connection · Bottleneck Residual Block · Average Pooling · Max Pooling · Kaiming Initialization · Residual Block · Global Average Pooling
