Dual-Perspective United Transformer for Object Segmentation in Optical Remote Sensing Images

Yanguang Sun, Jiexi Yan, Jianjun Qian, Chunyan Xu, Jian Yang, and Lei Luo

arXiv:2506.21866 · cs.CV · June 30, 2025

Open Access

TL;DR

This paper introduces DPU-Former, a Transformer-based model that combines global and local features for object segmentation in optical remote sensing images, targeting the feature heterogeneity and model complexity that arise when convolutional and Transformer features are used together.

Contribution

The paper proposes a dual-perspective Transformer architecture with global-local mixed attention and a Fourier merging strategy, enhancing feature integration for remote sensing image segmentation.

Findings

01. Outperforms state-of-the-art methods on multiple datasets

02. Effectively integrates long-range dependencies and spatial details

03. Demonstrates superior segmentation accuracy

Abstract

Automatically segmenting objects from optical remote sensing images (ORSIs) is an important task. Most existing models are primarily based on either convolutional or Transformer features, each offering distinct advantages. Exploiting both is a valuable research direction, but it presents several challenges, including the heterogeneity between the two types of features, high complexity, and the large number of model parameters. However, these issues are often overlooked in existing ORSI methods, leading to sub-optimal segmentation. To address this, we propose a novel Dual-Perspective United Transformer (DPU-Former) with a unique structure designed to simultaneously integrate long-range dependencies and spatial details. In particular, we design the global-local mixed attention, which captures diverse information through two perspectives and introduces a Fourier-space merging strategy to obviate…
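To make the idea of two perspectives merged in Fourier space more concrete, here is a minimal NumPy sketch. It is not the DPU-Former implementation: the abstract above does not specify the merging rule, so everything here is an assumption made for illustration. The function names (`global_perspective`, `local_perspective`, `fourier_merge`, `dual_perspective_sketch`), the fixed 3x3 mean filter standing in for a convolutional branch, the projection-free self-attention, and the low-pass/high-pass split with a `cutoff` parameter are all hypothetical choices.

```python
import numpy as np


def global_perspective(feat):
    """Single-head self-attention over all positions (long-range dependencies).

    ASSUMPTION: no learned projections; queries, keys and values are the
    input itself, purely to keep the sketch short.
    """
    h, w, c = feat.shape
    tokens = feat.reshape(h * w, c)
    scores = tokens @ tokens.T / np.sqrt(c)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights @ tokens).reshape(h, w, c)


def local_perspective(feat):
    """Fixed 3x3 mean filter: a hypothetical stand-in for a convolutional branch."""
    h, w, _ = feat.shape
    padded = np.pad(feat, ((1, 1), (1, 1), (0, 0)), mode="edge")
    out = np.zeros_like(feat)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0


def fourier_merge(global_feat, local_feat, cutoff=0.25):
    """Merge two (H, W, C) maps in the frequency domain.

    ASSUMPTION: low spatial frequencies are taken from the global branch and
    high frequencies from the local branch. The paper's actual rule may differ.
    """
    h, w, _ = global_feat.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    # Radially symmetric low-pass mask, broadcast over channels.
    low_pass = (np.sqrt(fy ** 2 + fx ** 2) <= cutoff).astype(float)[..., None]
    g_spec = np.fft.fft2(global_feat, axes=(0, 1))
    l_spec = np.fft.fft2(local_feat, axes=(0, 1))
    merged = low_pass * g_spec + (1.0 - low_pass) * l_spec
    # The mask is symmetric, so the result is real up to round-off.
    return np.fft.ifft2(merged, axes=(0, 1)).real


def dual_perspective_sketch(feat, cutoff=0.25):
    """Run both branches on the same input and merge them in Fourier space."""
    return fourier_merge(global_perspective(feat), local_perspective(feat), cutoff)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 8, 4))  # toy (H, W, C) feature map
    print(dual_perspective_sketch(feat).shape)
```

A frequency-dependent mask is used here because a single scalar weight in Fourier space would be equivalent to a plain spatial blend; the mask lets each branch contribute the frequency band it is assumed to represent best.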

Taxonomy

Topics: Advanced Neural Network Applications · Remote-Sensing Image Classification · Domain Adaptation and Few-Shot Learning