VDD: Varied Drone Dataset for Semantic Segmentation

Wenxiao Cai; Ke Jin; Jinyan Hou; Cong Guo; Letian Wu; Wankou Yang

arXiv:2305.13608 · cs.CV · April 1, 2025 · 1 citation

TL;DR

The paper introduces VDD, a densely labeled dataset of 400 high-resolution drone images spanning 7 classes and covering urban, industrial, rural, and natural scenes, intended to improve semantic segmentation accuracy in aerial drone imagery.

Contribution

The paper provides a new drone dataset with diverse scenes and dense annotations, addressing the small size and urban focus of existing datasets and supporting further work on drone image segmentation.

Findings

1. Seven state-of-the-art models trained on VDD serve as baselines.
2. VDD covers urban, industrial, rural, and natural scenes.
3. Public availability of the dataset encourages further research.

Abstract

Semantic segmentation of drone images is critical for various aerial vision tasks as it provides essential semantic details to understand scenes on the ground. Ensuring high accuracy of semantic segmentation models for drones requires access to diverse, large-scale, and high-resolution datasets, which are often scarce in the field of aerial image processing. While existing datasets typically focus on urban scenes and are relatively small, our Varied Drone Dataset (VDD) addresses these limitations by offering a large-scale, densely labeled collection of 400 high-resolution images spanning 7 classes. This dataset features various scenes in urban, industrial, rural, and natural areas, captured from different camera angles and under diverse lighting conditions. We also make new annotations to UDD and UAVid, integrating them under VDD annotation standards, to create the Integrated Drone…

Code & Models

Repositories

RussRobin/VDD (official)

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Domain Adaptation and Few-Shot Learning

Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Adam