Semantic Segmentation for Urban-Scene Images
Shorya Sharma

TL;DR
This paper enhances urban-scene image semantic segmentation by integrating HANet and WASP into DeepLabv3+, improving accuracy and efficiency for autonomous driving applications.
Contribution
It introduces a novel combination of HANet and WASP with DeepLabv3+ for urban scenes, emphasizing height-driven pattern recognition and computational efficiency.
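To make "height-driven" concrete, here is a heavily simplified sketch of attention computed per image row. It is not the authors' HANet implementation. It assumes only the general idea that each row of a feature map gets its own per-channel scaling factor, derived from width-wise pooled features. The function name `height_attention` and the `channel_mix` weight matrix are hypothetical, and the published module's learned layers are replaced here by a single fixed channel-mixing step.

```python
import math


def height_attention(features, channel_mix):
    """Scale each row of a feature map by a per-channel attention weight.

    features:    nested list shaped [C][H][W]
    channel_mix: hypothetical [C][C] weights standing in for learned layers
    """
    num_channels = len(features)
    height = len(features[0])

    # Width-wise average pooling: one value per (channel, row).
    pooled = [[sum(row) / len(row) for row in features[c]]
              for c in range(num_channels)]

    # Mix channels independently for every row, then squash with a sigmoid
    # so each attention weight lies in (0, 1).
    attention = []
    for c in range(num_channels):
        row_weights = []
        for h in range(height):
            z = sum(channel_mix[c][k] * pooled[k][h]
                    for k in range(num_channels))
            row_weights.append(1.0 / (1.0 + math.exp(-z)))
        attention.append(row_weights)

    # Broadcast each (channel, row) weight across the width.
    scaled = [[[attention[c][h] * v for v in features[c][h]]
               for h in range(height)]
              for c in range(num_channels)]
    return scaled, attention
```

With all-zero `channel_mix` every weight is 0.5, so the output is simply half the input. Non-zero weights let rows with different pooled statistics (for example, sky-like rows versus road-like rows) be emphasized differently.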
Findings
Improves the mean IoU score over the DeepLabv3+ baseline model.
HANet enhances class-specific segmentation accuracy.
WASP reduces training time and model parameters.
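For readers unfamiliar with the metric behind the first finding, mean IoU can be sketched as follows. This is a minimal illustration over flattened label lists, not the paper's evaluation code. The `ignore_index` parameter is an assumption, added to show how unlabeled pixels are commonly excluded, and the sketch assumes every predicted label lies in `range(num_classes)`.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over classes present in pred or target.

    pred, target: flat sequences of integer class labels of equal length.
    ignore_index: assumed label for pixels excluded from scoring.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: skip entirely
        if p == t:
            # True positive: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # False positive for class p, false negative for class t.
            union[p] += 1
            union[t] += 1

    # Average only over classes that actually occur.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")
```

For example, with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 scores 1/2 and class 1 scores 2/3, so the mean IoU is 7/12.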
Abstract
Urban-scene image segmentation is an important and trending topic in computer vision, with wide use cases such as autonomous driving [1]. Since the breakthrough work of Long et al. [2], which introduced Fully Convolutional Networks (FCNs), the development of novel architectures and practical uses of neural networks in semantic segmentation has accelerated over the past five years. Beyond general model-design remedies for the information loss caused by pooling, urban-scene images have intrinsic features of their own, such as positional patterns [3]. Our project seeks an advanced, integrated solution that specifically targets urban-scene semantic segmentation, drawing on the most novel approaches in the current field. We re-implement the cutting-edge model DeepLabv3+ [4] with a ResNet-101 [5] backbone as our strong baseline model. Based upon DeepLabv3+, we incorporate HANet [3] to…
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Automated Road and Building Extraction
Methods: Spatial Pyramid Pooling · Dilated Convolution · Height-driven Attention Network · Atrous Spatial Pyramid Pooling
