STDC-MA Network for Semantic Segmentation

Xiaochun Lei; Linjun Lu; Zetao Jiang; Zhaoting Gong; Chang Lu; Jiaming Liang

arXiv:2205.04639·cs.CV·May 12, 2022

TL;DR

The paper introduces the STDC-MA network, which combines a lightweight structure with multiscale attention to improve semantic segmentation accuracy, especially for small objects, while maintaining high speed.

Contribution

It proposes a novel STDC-MA network integrating feature alignment and hierarchical multiscale attention for improved segmentation performance.

Findings

01. Achieved 76.81% mIoU on Cityscapes at 0.5x input scale.

02. Improved small object segmentation accuracy.

03. Maintained high segmentation speed.
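
The mIoU figure reported above is a mean of per-class intersection-over-union scores. As a minimal sketch of the metric (not the paper's evaluation code), the function below computes it over flat lists of class labels. The name `mean_iou` and the toy labels are illustrative assumptions; benchmark tooling such as the Cityscapes scripts accumulates counts over the whole dataset rather than per image.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred and target are flat sequences of integer class labels of equal length.
    Classes that appear in neither sequence are skipped (an assumption made
    here for illustration).
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, 3)
print(f"mIoU = {score:.4f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.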

Abstract

Semantic segmentation is applied extensively in autonomous driving and intelligent transportation, where methods place heavy demands on both spatial and semantic information. Here, an STDC-MA network is proposed to meet these demands. First, the STDC-Seg structure is employed in STDC-MA to ensure a lightweight and efficient structure. Subsequently, the feature alignment module (FAM) is applied to learn the offset between high-level and low-level features, solving the pixel offset problem caused by upsampling the high-level feature map. This enables effective fusion of high-level and low-level features. A hierarchical multiscale attention mechanism is adopted to reveal the relationship among attention regions from two different input sizes of one image. Through this relationship, regions receiving much attention are integrated into the segmentation results,…
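
The abstract does not give the fusion formula, so the following is only a sketch of one common way two-scale attention fusion is written: an attention map predicted at the smaller input scale weights the upsampled low-scale scores, and its complement weights the full-scale scores. The helper names (`upsample2x`, `fuse_two_scales`), the nearest-neighbour upsampling, and the toy values are all assumptions made for illustration, not the authors' implementation.

```python
def upsample2x(grid):
    """Nearest-neighbour 2x upsampling of a 2D list of floats."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse_two_scales(score_low, attn_low, score_high):
    """Blend per-pixel scores from two input scales.

    score_low, attn_low: maps computed at 0.5x scale (half the height/width).
    score_high: map computed at 1.0x scale.
    Attention a in [0, 1] weights the low-scale score; (1 - a) weights the
    high-scale score.
    """
    up_score = upsample2x(score_low)
    up_attn = upsample2x(attn_low)
    return [
        [a * s_lo + (1.0 - a) * s_hi for a, s_lo, s_hi in zip(ra, rl, rh)]
        for ra, rl, rh in zip(up_attn, up_score, score_high)
    ]


# Toy single-class example: a 1x1 low-scale map and a 2x2 high-scale map.
score_low = [[0.8]]
attn_low = [[0.25]]
score_high = [[0.4, 0.4], [0.0, 1.0]]
fused = fuse_two_scales(score_low, attn_low, score_high)
for row in fused:
    print(row)
```

With a low attention value (0.25 here) the fused result stays close to the full-scale prediction, which is the behaviour one would want for small objects that are hard to resolve at the reduced scale.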

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings