A Global-Local Cross-Attention Network for Ultra-high Resolution Remote Sensing Image Semantic Segmentation
Chen Yi, Shan LianLei

TL;DR
This paper introduces GLCANet, a lightweight global-local cross-attention network that improves semantic segmentation of ultra-high resolution remote sensing images by enhancing feature fusion and computational efficiency.
Contribution
The paper presents a novel dual-stream architecture with self-attention and masked cross-attention mechanisms for efficient global-local feature fusion in UHR remote sensing imagery.
Findings
Outperforms state-of-the-art methods in accuracy
Reduces GPU memory usage significantly
Effectively processes large high-resolution images
Abstract
With the rapid development of ultra-high resolution (UHR) remote sensing technology, the demand for accurate and efficient semantic segmentation has increased significantly. However, existing methods face challenges in computational efficiency and multi-scale feature fusion. To address these issues, we propose GLCANet (Global-Local Cross-Attention Network), a lightweight segmentation framework designed for UHR remote sensing imagery. GLCANet employs a dual-stream architecture to efficiently fuse global semantics and local details while minimizing GPU memory usage. A self-attention mechanism enhances long-range dependencies, refines global features, and preserves local details for better semantic consistency. A masked cross-attention mechanism then adaptively fuses global and local features, selectively enhancing fine-grained details while exploiting global context to improve segmentation accuracy.…
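The masked cross-attention idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name masked_cross_attention, the choice of local features as queries and global features as keys/values, the boolean mask layout, the residual connection, and the omission of learned projection matrices are all assumptions made for illustration only.

```python
import numpy as np

def masked_cross_attention(local_feats, global_feats, mask):
    """Hypothetical sketch of masked cross-attention (not the paper's code).

    local_feats:  (n_local, d)  array used as queries (assumption)
    global_feats: (n_global, d) array used as keys and values (assumption)
    mask:         (n_local, n_global) boolean, True where attention is allowed
    """
    if not mask.any(axis=-1).all():
        raise ValueError("each local token must be allowed at least one global token")

    d = local_feats.shape[-1]
    # Scaled dot-product similarity between every local and global token.
    scores = local_feats @ global_feats.T / np.sqrt(d)
    # Disallowed pairs get -inf so they receive zero weight after softmax.
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax over the global axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Residual fusion: local details plus the selected global context (assumption).
    fused = local_feats + weights @ global_feats
    return fused, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local = rng.normal(size=(4, 8))    # 4 local tokens, 8 channels
    glob = rng.normal(size=(6, 8))     # 6 global tokens, 8 channels
    mask = np.ones((4, 6), dtype=bool)
    mask[:, 3:] = False                # only the first 3 global tokens are visible
    fused, weights = masked_cross_attention(local, glob, mask)
    print(fused.shape, weights.shape)
```

In this sketch the mask decides which global tokens each local token may draw context from. A real model would typically add learned query, key, and value projections and multiple heads.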
Taxonomy
Topics: Advanced Image Fusion Techniques · Remote-Sensing Image Classification · Medical Image Segmentation Techniques
