UVid-Net: Enhanced Semantic Segmentation of UAV Aerial Videos by Embedding Temporal Information

Girisha S; Ujjwal Verma; Manohara Pai M M; Radhika Pai

arXiv:2011.14284·cs.CV·May 28, 2021

TL;DR

UVid-Net is an encoder-decoder CNN architecture that embeds temporal information directly into the encoder for semantic segmentation of UAV aerial videos, improving accuracy and efficiency without a separate temporal module such as an LSTM or optical flow.

Contribution

This work introduces UVid-Net, an encoder-decoder CNN that incorporates temporal data within the encoder, improving segmentation accuracy and efficiency for UAV videos.
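The summary above does not say how the encoder combines frames, so the following is only a toy sketch of one way temporal information could enter an encoder: two consecutive frames are downsampled separately and their feature maps are concatenated along the channel axis. The two-frame input, the pooling-only "branches", and the names `max_pool_2x2` and `fuse_consecutive_frames` are assumptions made for illustration, not the architecture described in the paper.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling over a (channels, height, width) array.

    Height and width are assumed to be even.
    """
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def fuse_consecutive_frames(prev_frame, curr_frame):
    """Hypothetical fusion step: pool each frame, then stack along channels.

    A real encoder would apply learned convolutions before pooling; pooling
    alone stands in for a branch here (an assumption for illustration).
    """
    prev_feat = max_pool_2x2(prev_frame)
    curr_feat = max_pool_2x2(curr_frame)
    return np.concatenate([prev_feat, curr_feat], axis=0)

# Two random RGB "frames" of size 8x8 standing in for consecutive video frames.
rng = np.random.default_rng(0)
prev_frame = rng.random((3, 8, 8))
curr_frame = rng.random((3, 8, 8))

fused = fuse_consecutive_frames(prev_frame, curr_frame)
print(fused.shape)  # (6, 4, 4)
```

The fused tensor carries features from both time steps in a single forward pass, which is the general idea behind avoiding a separate recurrent or optical-flow stage.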

Findings

01

Achieved an mIoU of 0.79 on the ManipalUAVid dataset.

02

Outperformed existing state-of-the-art algorithms.

03

Showed promising results with transfer learning on urban street scenes.
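For readers unfamiliar with the metric in the first finding, mean intersection-over-union (mIoU) averages the per-class ratio of correctly labelled pixels to the union of predicted and ground-truth pixels. The sketch below computes it from a confusion matrix with NumPy; the function name `mean_iou`, the toy label maps, and the choice to skip classes absent from both maps are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both maps
    return float((intersection[present] / union[present]).mean())

# Toy 2x3 label maps with three hypothetical classes (0, 1, 2).
target = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])

score = mean_iou(pred, target, num_classes=3)
print(round(score, 3))  # 0.5
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, which average to 0.5.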

Abstract

Semantic segmentation of aerial videos has been extensively used for decision making in monitoring environmental changes, urban planning, and disaster management. The reliability of these decision support systems depends on the accuracy of the video semantic segmentation algorithms. Existing CNN-based video semantic segmentation methods extend image semantic segmentation methods with an additional module, such as an LSTM or optical flow, to compute the temporal dynamics of the video, which adds computational overhead. The proposed work instead modifies the CNN architecture itself to incorporate temporal information and improve the efficiency of video semantic segmentation. In this work, an enhanced encoder-decoder CNN architecture (UVid-Net) is proposed for UAV video semantic segmentation. The encoder of the proposed architecture embeds temporal information…

Code & Models

Repositories

uverma/ManipalUAVid
Official

Taxonomy

Methods

Concatenated Skip Connection · Max Pooling · Convolution · U-Net · Sigmoid Activation · Tanh Activation · Long Short-Term Memory