Dense CNN Learning with Equivalent Mappings
Jianxin Wu, Chen-Wei Xie, Jian-Hao Luo

TL;DR
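This paper introduces equivalent convolution (eConv) and equivalent pooling (ePool) layers, which let a CNN produce dense predictions that remain equivalent to those of the baseline model. Parameters can be transferred from the baseline to the dense model and back, so the original model keeps its fast testing speed while gaining accuracy.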
Contribution
The paper proposes novel eConv and ePool layers that produce dense, equivalent predictions, facilitating parameter transfer and improving accuracy across multiple vision tasks.
Findings
eConv and ePool achieve higher accuracy than baseline CNNs.
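Parameters transfer in both directions: the baseline CNN's parameters initialize the dense model, and the parameters learned in the dense model can be transferred back to the original one.
After inverse transfer, the original model has both fast testing speed and high accuracy.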
Abstract
A large receptive field and dense prediction are both important for achieving high accuracy in pixel labeling tasks such as semantic segmentation. These two properties, however, contradict each other. A pooling layer (with stride 2) quadruples the receptive field size but reduces the number of predictions to 25%. Some existing methods produce dense predictions using computations that are not equivalent to the original model. In this paper, we propose the equivalent convolution (eConv) and equivalent pooling (ePool) layers, leading to predictions that are both dense and equivalent to the baseline CNN model. Dense prediction models learned using eConv and ePool can transfer the baseline CNN's parameters as a starting point, and can inverse transfer the learned parameters in a dense model back to the original one, which has both fast testing speed and high accuracy. The proposed eConv…
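The trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's eConv or ePool implementation; it only uses plain 2x2 max pooling on a toy 8x8 feature map (the helper `max_pool2d` and the toy data are assumptions made for illustration). It shows that stride-2 pooling keeps 25% of the prediction positions, and that a dense stride-1 pooling of the same input contains the stride-2 outputs as a regular subsample, which is one simple sense in which a dense computation can agree with the baseline one.

```python
def max_pool2d(x, k=2, stride=2):
    """Max pooling over a 2D list with a k x k window and no padding."""
    h, w = len(x), len(x[0])
    out = []
    for i in range(0, h - k + 1, stride):
        row = []
        for j in range(0, w - k + 1, stride):
            # Maximum over the k x k window whose top-left corner is (i, j).
            row.append(max(x[i + di][j + dj] for di in range(k) for dj in range(k)))
        out.append(row)
    return out


# Hypothetical toy feature map (8 x 8); the values are arbitrary.
x = [[(7 * i + 3 * j) % 11 for j in range(8)] for i in range(8)]

strided = max_pool2d(x, k=2, stride=2)  # baseline-style pooling: 4 x 4 outputs
dense = max_pool2d(x, k=2, stride=1)    # dense pooling: 7 x 7 outputs

# Fraction of prediction positions that survive stride-2 pooling.
ratio = (len(strided) * len(strided[0])) / (len(x) * len(x[0]))

# Every second row and column of the dense output reproduces the strided output.
subsampled = [row[::2] for row in dense[::2]]

print(ratio)
print(subsampled == strided)
```

Here the dense output is computed at every valid window position, so the baseline outputs can be read off from it without retraining or changing any values; how the paper keeps later layers equivalent after such a dense layer is not covered by this sketch.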
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution
