TL;DR
This paper systematically compares receptive field enlargement methods for semantic segmentation (downsampling via pooling or strided convolutions versus dilated convolutions) and proposes FC-DRN, a densely connected backbone built from residual networks that achieves state-of-the-art results with fewer parameters.
Contribution
Introduces FC-DRN, a new architecture combining residual and dense connections, and studies the impact of different receptive field enlargement methods on segmentation performance.
Findings
Downsampling outperforms dilations when training from scratch.
Dilations are beneficial during finetuning.
Coarser representations need fewer refinement steps.
Abstract
State-of-the-art semantic segmentation approaches increase the receptive field of their models by using either a downsampling path composed of poolings/strided convolutions or successive dilated convolutions. However, it is not clear which operation leads to best results. In this paper, we systematically study the differences introduced by distinct receptive field enlargement methods and their impact on the performance of a novel architecture, called Fully Convolutional DenseResNet (FC-DRN). FC-DRN has a densely connected backbone composed of residual networks. Following standard image segmentation architectures, receptive field enlargement operations that change the representation level are interleaved among residual networks. This allows the model to exploit the benefits of both residual and dense connectivity patterns, namely: gradient flow, iterative refinement of representations,…
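To make the two enlargement strategies named in the abstract concrete, here is a minimal sketch of the standard receptive-field recurrence for a stack of convolution and pooling layers. It is an illustration only, not code from the paper: the function name `receptive_field` and both layer stacks are hypothetical and are not the FC-DRN configuration.

```python
from typing import List, Tuple

# Each layer is (kernel_size, stride, dilation). Illustrative format, not from the paper.
Layer = Tuple[int, int, int]


def receptive_field(layers: List[Layer]) -> Tuple[int, int]:
    """Return (receptive field, cumulative stride) along one spatial axis."""
    rf, jump = 1, 1  # one input pixel; adjacent outputs are 1 input pixel apart
    for kernel, stride, dilation in layers:
        # A dilated kernel spans dilation * (kernel - 1) + 1 positions.
        effective_kernel = dilation * (kernel - 1) + 1
        # Each extra position widens the field by the current spacing.
        rf += (effective_kernel - 1) * jump
        # Strides multiply the spacing between neighbouring outputs.
        jump *= stride
    return rf, jump


# Hypothetical downsampling stack: 3x3 conv, 2x2 pool, 3x3 conv, 2x2 pool, 3x3 conv.
downsampling = [(3, 1, 1), (2, 2, 1), (3, 1, 1), (2, 2, 1), (3, 1, 1)]

# Hypothetical dilated stack: three 3x3 convs with dilation 1, 2, 4 and no stride.
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

if __name__ == "__main__":
    print(receptive_field(downsampling))  # (18, 4)
    print(receptive_field(dilated))       # (15, 1)
```

In this toy comparison both stacks reach a similar receptive field, but the downsampling stack ends at one quarter of the input resolution while the dilated stack keeps full resolution. That is the trade-off behind the question the abstract raises about which operation leads to the best results.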
