DiRecNetV2: A Transformer-Enhanced Network for Aerial Disaster Recognition

Demetris Shianios; Panayiotis Kolios; Christos Kyrkou

arXiv:2410.13663·cs.CV·October 18, 2024

TL;DR

DiRecNetV2 is a hybrid CNN-transformer model designed for aerial disaster recognition, achieving high accuracy and real-time performance on UAVs, and introduces a new multi-label disaster dataset for benchmarking.

Contribution

The paper presents DiRecNetV2, a novel hybrid model combining CNNs and transformers for UAV-based disaster recognition, and introduces a new multi-label disaster dataset for benchmarking.

Findings

01

Achieves a weighted F1 score of 0.964 on single-label data.

02

Maintains 176.13 FPS on an Nvidia Jetson Orin device.

03

Demonstrates adaptability with a 0.614 score on multi-label data.
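
The weighted F1 score cited in the findings is commonly defined as the mean of per-class F1 scores, with each class weighted by its share of the true labels. The sketch below illustrates that common definition in plain Python. It is not the authors' evaluation code, and the function name `weighted_f1` and the class labels `fire`, `flood`, and `normal` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (single-label case)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n_cls in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = n_cls - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its support.
        score += (n_cls / total) * f1
    return score


# Hypothetical labels, for illustration only.
y_true = ["fire", "fire", "flood", "normal", "normal", "normal"]
y_pred = ["fire", "flood", "flood", "normal", "normal", "fire"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.6778
```

In this toy example the per-class F1 scores are 0.5 (fire), 2/3 (flood), and 0.8 (normal), and the supports of 2, 1, and 3 samples give a weighted mean of 61/90. Because larger classes dominate the average, the metric suits imbalanced test sets where a plain macro average would overstate the influence of rare classes.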

Abstract

The integration of Unmanned Aerial Vehicles (UAVs) with artificial intelligence (AI) models for aerial imagery processing in disaster assessment necessitates models that demonstrate exceptional accuracy, computational efficiency, and real-time processing capabilities. Traditionally, Convolutional Neural Networks (CNNs) demonstrate efficiency in local feature extraction but are limited in their capacity for global context interpretation. On the other hand, Vision Transformers (ViTs) show promise for improved global context interpretation through the use of attention mechanisms, although they remain underinvestigated in UAV-based disaster response applications. Bridging this research gap, we introduce DiRecNetV2, an improved hybrid model that utilizes convolutional and transformer layers. It merges the inductive biases of CNNs for robust feature extraction with the global context…

Taxonomy

Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training