A Novel Shape Guided Transformer Network for Instance Segmentation in Remote Sensing Images
Dawen Yu, Shunping Ji

TL;DR
This paper introduces a Shape Guided Transformer Network (SGTN) that combines a novel LSwin transformer encoder and a shape guidance module to improve instance segmentation accuracy in remote sensing images, especially in challenging atmospheric conditions.
Contribution
The paper proposes a new transformer encoder, LSwin, with enhanced global perception, and a shape guidance module for better boundary and shape extraction in remote sensing image segmentation.
Findings
LSwin outperforms ResNet and Swin Transformer encoders in efficiency.
SGTN achieves the highest AP scores on multiple remote sensing datasets.
The combined approach effectively emphasizes local shape details and global context.
Abstract
Instance segmentation performance in remote sensing images (RSIs) is significantly affected by two issues: how to extract accurate boundaries of objects from remote imaging through the dynamic atmosphere, and how to integrate the mutual information of related object instances scattered over a vast spatial region. In this study, we propose a novel Shape Guided Transformer Network (SGTN) to accurately extract objects at the instance level. Inspired by the global contextual modeling capacity of the self-attention mechanism, we propose an effective transformer encoder termed LSwin, which incorporates vertical and horizontal 1D global self-attention mechanisms to obtain better global-perception capacity for RSIs than the popular local-shifted-window based Swin Transformer. To achieve accurate instance mask segmentation, we introduce a shape guidance module (SGM) to emphasize the object…
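The abstract's idea of vertical and horizontal 1D global self-attention can be illustrated with a minimal NumPy sketch. This is not the authors' LSwin implementation. The function names (`attention_1d`, `axial_attention`), the single-head design, the shared projection matrices, and the choice to sum the two directions are all assumptions made for illustration only.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attention_1d(x, wq, wk, wv):
    # Single-head self-attention over the middle axis.
    # x: (batch, length, channels); each batch entry is one 1D sequence.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def axial_attention(feat, wq, wk, wv):
    # feat: (H, W, C) feature map.
    # Horizontal pass: every row is a sequence of W tokens, so each pixel
    # attends to the full width of the image.
    horiz = attention_1d(feat, wq, wk, wv)
    # Vertical pass: every column is a sequence of H tokens, so each pixel
    # attends to the full height of the image.
    vert = attention_1d(feat.transpose(1, 0, 2), wq, wk, wv).transpose(1, 0, 2)
    # Assumption: the two directions are simply summed.
    return horiz + vert
```

In this sketch each output pixel sees its entire row and column in a single layer, at a cost of roughly O(HW(H + W)) score entries instead of the O((HW)^2) needed for full 2D global attention. A windowed scheme, by contrast, restricts each pixel to a local neighborhood per layer.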
Taxonomy
Topics: Medical Image Segmentation Techniques · Remote Sensing and Land Use · Image Retrieval and Classification Techniques
Methods: Attention Is All You Need · Byte Pair Encoding · Average Pooling · Linear Layer · Softmax · Dense Connections · Absolute Position Encodings · Dropout · Adam · Residual Connection
