Dual-Perspective United Transformer for Object Segmentation in Optical Remote Sensing Images
Yanguang Sun, Jiexi Yan, Jianjun Qian, Chunyan Xu, Jian Yang, and Lei Luo

TL;DR
This paper introduces DPU-Former, a Transformer-based model that combines global (long-range) and local (spatial-detail) features for object segmentation in optical remote sensing images, targeting the heterogeneity between convolutional and Transformer features and the complexity that combining them usually adds.
Contribution
The paper proposes a dual-perspective Transformer architecture with global-local mixed attention and a Fourier merging strategy, enhancing feature integration for remote sensing image segmentation.
Findings
Outperforms state-of-the-art methods on multiple datasets
Effectively integrates long-range dependencies and spatial details
Demonstrates superior segmentation accuracy
Abstract
Automatically segmenting objects from optical remote sensing images (ORSIs) is an important task. Most existing models are primarily based on either convolutional or Transformer features, each offering distinct advantages. Exploiting both advantages is a valuable research direction, but it presents several challenges, including the heterogeneity between the two types of features, high complexity, and a large number of model parameters. However, these issues are often overlooked in existing ORSI methods, leading to sub-optimal segmentation. To address this, we propose a novel Dual-Perspective United Transformer (DPU-Former) with a unique structure designed to simultaneously integrate long-range dependencies and spatial details. In particular, we design the global-local mixed attention, which captures diverse information through two perspectives and introduces a Fourier-space merging strategy to obviate…
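The abstract is cut off before it describes the Fourier-space merging strategy, so the paper's actual formulation is not available here. As a rough illustration of what merging two feature maps in Fourier space can look like, the sketch below blends a "global" and a "local" map by frequency band. The function name `fourier_merge`, the `cutoff` parameter, and the low-pass/high-pass split are all assumptions made for this example, not the DPU-Former design.

```python
import numpy as np


def fourier_merge(global_feat, local_feat, cutoff=0.25):
    """Blend two (H, W) feature maps in the frequency domain.

    Hypothetical scheme, NOT the paper's method: frequencies with radius
    <= cutoff (in cycles per sample) come from global_feat, all higher
    frequencies come from local_feat.
    """
    if global_feat.shape != local_feat.shape:
        raise ValueError("feature maps must have the same shape")

    h, w = global_feat.shape
    # Per-axis frequency coordinates in cycles per sample, range [-0.5, 0.5).
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # 1.0 inside the low-frequency disc, 0.0 outside.
    low_mask = (radius <= cutoff).astype(float)

    g_spec = np.fft.fft2(global_feat)
    l_spec = np.fft.fft2(local_feat)

    # Low band from the global branch, high band from the local branch.
    merged_spec = low_mask * g_spec + (1.0 - low_mask) * l_spec

    # The mask is symmetric in frequency, so real inputs give a real output
    # up to floating-point error; drop the residual imaginary part.
    return np.fft.ifft2(merged_spec).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.standard_normal((8, 8))   # stand-in for a long-range feature map
    l = rng.standard_normal((8, 8))   # stand-in for a spatial-detail feature map
    print(fourier_merge(g, l).shape)
```

With `cutoff=0.0` only the zero-frequency (mean) component is taken from the global map, which makes the behaviour easy to check by hand. A learned, per-channel weighting in place of the fixed mask would be a natural extension, but that too is speculation about the design rather than something stated in the source.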
Taxonomy
Topics: Advanced Neural Network Applications · Remote-Sensing Image Classification · Domain Adaptation and Few-Shot Learning
