Learning Dilation Factors for Semantic Segmentation of Street Scenes
Yang He, Margret Keuper, Bernt Schiele, Mario Fritz

TL;DR
This paper introduces a method to adaptively learn dilation factors in convolutional neural networks, enhancing semantic segmentation accuracy for street scenes by balancing detail preservation and receptive field size.
Contribution
It proposes a novel approach for learning dilation parameters per channel, replacing fixed values, leading to improved segmentation performance on street scene datasets.
Findings
Improved segmentation accuracy on the Cityscapes and CamVid datasets.
Adaptive dilation learning outperforms fixed dilation methods.
Enhanced balance between detail and context in segmentation results.
Abstract
Contextual information is crucial for semantic segmentation. However, finding the optimal trade-off between preserving fine details and providing sufficiently large receptive fields is nontrivial, all the more so when the objects or classes present in an image vary significantly in size. Dilated convolutions have proven valuable for semantic segmentation because they increase the size of the receptive field without sacrificing image resolution. However, in current state-of-the-art methods, dilation parameters are hand-tuned and fixed. In this paper, we present an approach for learning dilation parameters adaptively per channel, consistently improving semantic segmentation results on street-scene datasets such as Cityscapes and CamVid.
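To make the receptive-field claim concrete, here is a minimal sketch of a fixed, integer-dilated 1D convolution in plain Python. It is only an illustration of standard dilated convolution, not the paper's learned per-channel method; the function names `effective_kernel_size` and `dilated_conv1d` are hypothetical, and an odd kernel length with zero padding is assumed.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded 1D dilated cross-correlation with 'same' output length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel length")
    n = len(signal)
    half = dilation * (k - 1) // 2  # offset from the centre tap to the outermost tap
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            pos = i - half + j * dilation  # taps are spaced `dilation` samples apart
            if 0 <= pos < n:               # positions outside the signal count as zero
                acc += w * signal[pos]
        out.append(acc)
    return out


# A unit impulse shows which input positions each output can "see".
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
dense = dilated_conv1d(impulse, [1, 1, 1], dilation=1)
dilated = dilated_conv1d(impulse, [1, 1, 1], dilation=2)
```

With the same three weights, dilation 2 spreads the response over a span of five samples instead of three, while the output keeps the input's length. That is the sense in which dilation enlarges the receptive field without reducing resolution.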
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
