Deep multi-task learning for a geographically-regularized semantic segmentation of aerial images
Michele Volpi, Devis Tuia

TL;DR
This paper introduces a multi-task deep learning approach for semantic segmentation of high-resolution aerial images: a CNN jointly learns semantic class likelihoods, semantic boundaries and shallow-to-deep visual features, and a conditional random field over a hierarchy of segments provides the spatial regularization.
Contribution
The paper presents a novel multi-task CNN architecture integrated with a hierarchical CRF for enhanced spatial regularization in aerial image segmentation.
Findings
Outperforms state-of-the-art baselines in segmentation accuracy
Provides flexible framework for combining visual and structural cues
Achieves better regularization through hierarchical spatial constraints
Abstract
When approaching the semantic segmentation of overhead imagery in the decimeter spatial resolution range, successful strategies usually combine powerful methods to learn the visual appearance of the semantic classes (e.g. convolutional neural networks) with strategies for spatial regularization (e.g. graphical models such as conditional random fields). In this paper, we propose a method to learn evidence in the form of semantic class likelihoods, semantic boundaries across classes and shallow-to-deep visual features, each one modeled by a multi-task convolutional neural network architecture. We combine this bottom-up information with top-down spatial regularization encoded by a conditional random field model optimizing the label space across a hierarchy of segments with constraints related to structural, spatial and data-dependent pairwise relationships between regions. Our results show…
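The interplay between bottom-up evidence and pairwise regularization described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a flat (non-hierarchical) region graph, a Potts pairwise term whose weight is reduced where a boundary is likely, and iterated conditional modes (ICM) as a simple stand-in for the inference procedure. The names `crf_energy` and `icm`, the toy probabilities and the weighting `1 - boundary` are all hypothetical choices made for illustration.

```python
import numpy as np

def crf_energy(labels, unary, edges, weights):
    """Sum of unary costs plus a weighted Potts penalty for each
    pair of neighbouring regions that carry different labels."""
    labels = np.asarray(labels)
    energy = unary[np.arange(len(labels)), labels].sum()
    for (i, j), w in zip(edges, weights):
        if labels[i] != labels[j]:
            energy += w
    return float(energy)

def icm(unary, edges, weights, n_iters=10):
    """Iterated conditional modes: greedily relabel one region at a
    time, keeping all other labels fixed, until nothing changes."""
    n, k = unary.shape
    labels = unary.argmin(axis=1)          # start from the unary-only labelling
    neighbours = [[] for _ in range(n)]
    for (i, j), w in zip(edges, weights):
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))
    classes = np.arange(k)
    for _ in range(n_iters):
        changed = False
        for i in range(n):
            cost = unary[i].copy()
            for j, w in neighbours[i]:
                # Potts term: pay w for every class that differs from neighbour j
                cost += w * (classes != labels[j])
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Toy data (assumed): 4 regions in a chain, 2 classes.
probs = np.array([[0.90, 0.10],
                  [0.45, 0.55],
                  [0.20, 0.80],
                  [0.10, 0.90]])           # class likelihoods per region
unary = -np.log(probs)                      # negative log-likelihood as unary cost
edges = [(0, 1), (1, 2), (2, 3)]            # adjacency between regions
boundary = np.array([0.05, 0.90, 0.10])     # boundary likelihood on each edge
weights = 1.0 - boundary                    # weak smoothing across likely boundaries

labels = icm(unary, edges, weights)
print(labels, crf_energy(labels, unary, edges, weights))
```

In this toy setting the ambiguous second region is pulled towards its left neighbour, because the edge between them has a low boundary likelihood, while the strong boundary on the middle edge lets the label change there at little cost.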
