Semantic Labeling of High Resolution Images Using EfficientUNets and Transformers

Hasan AlMarzouqi; Lyes Saad Saoud

arXiv:2206.09731·cs.CV·July 19, 2023

TL;DR

This paper introduces a novel segmentation model combining CNNs and transformers for high-resolution remote sensing images, enhancing accuracy by integrating multi-modal data and multi-task strategies.

Contribution

It proposes a new hybrid CNN-transformer model with fusion layers for multi-modal input and multi-task segmentation, improving remote sensing image analysis.

Findings

1. Improved segmentation accuracy over state-of-the-art methods
2. Effective multi-modal data integration via fusion layers
3. Enhanced global and local feature learning

Abstract

Semantic segmentation necessitates approaches that learn high-level characteristics while dealing with enormous amounts of data. Convolutional neural networks (CNNs) can learn unique and adaptive features to achieve this aim. However, due to the large size and high spatial resolution of remote sensing images, these networks cannot analyze an entire scene efficiently. Recently, deep transformers have proven their capability to capture global interactions between different objects in the image. In this paper, we propose a new segmentation model that combines convolutional neural networks with transformers, and show that this mixture of local and global feature extraction techniques provides significant advantages in remote sensing segmentation. In addition, the proposed model includes two fusion layers that are designed to represent multi-modal inputs and outputs of the network efficiently.…
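
To make the idea of mixing local and global feature extraction concrete, here is a minimal toy sketch in plain Python. It is not the authors' EfficientUNet or transformer architecture, and the abstract does not specify how the fusion layers work. Every name here (`fuse_modalities`, `local_features`, `global_attention`, `hybrid_features`) is hypothetical, the weighted sum is an assumed stand-in for an input fusion layer, the 3x3 mean filter is a stand-in for a learned convolution, and the attention uses identity projections instead of learned ones.

```python
import math

def fuse_modalities(modalities, weights):
    """Per-pixel weighted sum of equally sized 2D grids (assumed fusion rule)."""
    rows, cols = len(modalities[0]), len(modalities[0][0])
    return [[sum(w * m[r][c] for m, w in zip(modalities, weights))
             for c in range(cols)] for r in range(rows)]

def local_features(image):
    """3x3 mean filter with zero padding: a stand-in for a learned convolution."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc]
            out[r][c] = total / 9.0
    return out

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def global_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (toy version)."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        attn = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(attn, tokens))
                    for i in range(dim)])
    return out

def hybrid_features(modalities, weights):
    """Return one [local, global] feature pair per pixel, in row-major order."""
    fused = fuse_modalities(modalities, weights)
    local = local_features(fused)
    tokens = [[v] for row in local for v in row]  # one 1-D token per pixel
    context = global_attention(tokens)            # every pixel attends to all pixels
    return [[loc[0], ctx[0]] for loc, ctx in zip(tokens, context)]

# Two made-up 2x2 modalities, e.g. an optical band and a height map.
optical = [[1.0, 0.0], [0.0, 1.0]]
height = [[0.0, 2.0], [2.0, 0.0]]
features = hybrid_features([optical, height], [0.5, 0.5])
```

The local branch only sees a 3x3 neighborhood, while the attention branch lets each pixel draw on every other pixel, which is the complementary behavior the abstract attributes to CNNs and transformers. A real model would replace each stand-in with learned layers and would operate on patches, not single pixels.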

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Remote-Sensing Image Classification · Advanced Neural Network Applications