Deep multi-task learning for a geographically-regularized semantic segmentation of aerial images

Michele Volpi; Devis Tuia

arXiv:1808.07675·cs.CV·August 24, 2018


TL;DR

This paper pairs a multi-task CNN, which predicts semantic class likelihoods, semantic boundaries, and shallow-to-deep visual features, with a conditional random field defined over a hierarchy of segments, in order to spatially regularize the semantic segmentation of high-resolution aerial images.

Contribution

The paper presents a novel multi-task CNN architecture integrated with a hierarchical CRF for enhanced spatial regularization in aerial image segmentation.
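
As a rough illustration of what "multi-task" can mean here, the sketch below sums a per-pixel classification loss and a per-pixel boundary-detection loss into one training objective. It is a minimal NumPy sketch under stated assumptions, not the authors' loss: the function names, the `boundary_weight` parameter, and the choice of cross-entropy terms are all hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean per-pixel cross-entropy. logits: (H, W, C); labels: (H, W) ints."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    h, w = labels.shape
    # Pick the log-probability of the true class at every pixel.
    picked = log_probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return -picked.mean()

def binary_cross_entropy(logits, targets):
    """Mean per-pixel binary cross-entropy. logits, targets: (H, W)."""
    # Stable form: max(x, 0) - x * t + log(1 + exp(-|x|))
    return np.mean(
        np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    )

def multitask_loss(class_logits, class_labels, boundary_logits, boundary_targets,
                   boundary_weight=1.0):
    """Hypothetical joint loss: class term plus weighted boundary term."""
    class_term = softmax_cross_entropy(class_logits, class_labels)
    boundary_term = binary_cross_entropy(boundary_logits, boundary_targets)
    return class_term + boundary_weight * boundary_term
```

With this kind of objective, a shared network body receives gradients from both tasks, and `boundary_weight` controls how strongly the boundary task shapes the shared features.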

Findings

1. Outperforms state-of-the-art baselines in segmentation accuracy

2. Provides flexible framework for combining visual and structural cues

3. Achieves better regularization through hierarchical spatial constraints

Abstract

When approaching the semantic segmentation of overhead imagery in the decimeter spatial resolution range, successful strategies usually combine powerful methods to learn the visual appearance of the semantic classes (e.g. convolutional neural networks) with strategies for spatial regularization (e.g. graphical models such as conditional random fields). In this paper, we propose a method to learn evidence in the form of semantic class likelihoods, semantic boundaries across classes and shallow-to-deep visual features, each one modeled by a multi-task convolutional neural network architecture. We combine this bottom-up information with top-down spatial regularization encoded by a conditional random field model optimizing the label space across a hierarchy of segments with constraints related to structural, spatial and data-dependent pairwise relationships between regions. Our results show…
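
To make the combination of bottom-up evidence and top-down regularization more concrete, the sketch below scores a labeling of image regions with a simple CRF-style energy: a unary term from class likelihoods plus a pairwise term that penalizes label changes between adjacent regions, discounted where the predicted boundary is strong. This is a deliberately simplified, flat, contrast-sensitive Potts model, not the hierarchical model of the paper; `crf_energy`, `pairwise_weight`, and the exponential discount are assumptions made for illustration.

```python
import numpy as np

def crf_energy(class_probs, edges, boundary_strength, labels, pairwise_weight=1.0):
    """Energy of a region labeling (lower is better).

    class_probs:       (R, C) class likelihoods per region.
    edges:             list of (i, j) pairs of adjacent regions.
    boundary_strength: one non-negative value per edge; higher means the
                       boundary evidence supports a label change there.
    labels:            length-R sequence of integer class labels.
    """
    labels = np.asarray(labels)
    eps = 1e-12  # avoids log(0)
    # Unary term: negative log-likelihood of the chosen class per region.
    unary = -np.log(class_probs[np.arange(len(labels)), labels] + eps).sum()
    # Pairwise term: Potts penalty for differing neighbors, reduced
    # when a strong boundary separates the two regions.
    pairwise = 0.0
    for (i, j), b in zip(edges, boundary_strength):
        if labels[i] != labels[j]:
            pairwise += np.exp(-b)
    return unary + pairwise_weight * pairwise

# Three regions in a chain; a strong boundary lies between regions 1 and 2.
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8]])
edges = [(0, 1), (1, 2)]
strength = [0.0, 2.0]
energy_split = crf_energy(probs, edges, strength, [0, 0, 1])
energy_flat = crf_energy(probs, edges, strength, [0, 0, 0])
```

In this toy case the labeling that changes class across the strong boundary has lower energy than the uniform labeling, which is the behavior a boundary-aware pairwise term is meant to encourage.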
