DiRecNetV2: A Transformer-Enhanced Network for Aerial Disaster Recognition
Demetris Shianios, Panayiotis Kolios, Christos Kyrkou

TL;DR
DiRecNetV2 is a hybrid CNN-transformer model designed for aerial disaster recognition, achieving high accuracy and real-time performance on UAVs, and introduces a new multi-label disaster dataset for benchmarking.
Contribution
The paper presents DiRecNetV2, a novel hybrid model combining CNNs and transformers for UAV-based disaster recognition, and introduces a new multi-label disaster dataset for benchmarking.
Findings
Achieves a weighted F1 score of 0.964 on single-label data.
Maintains 176.13 FPS on an NVIDIA Jetson Orin device.
Demonstrates adaptability with a 0.614 score on multi-label data.
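For readers unfamiliar with the metric behind the single-label result, here is a minimal sketch of a weighted F1 score: the per-class F1 scores averaged with weights proportional to each class's share of the true labels. The function name `weighted_f1` and the toy class names are illustrative assumptions, not taken from the paper, and this is not the authors' evaluation code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (single-label case).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by the class's share of true labels
    return score


# Toy labels (assumed class names, for illustration only)
y_true = ["fire", "fire", "flood", "normal"]
y_pred = ["fire", "flood", "flood", "normal"]
print(weighted_f1(y_true, y_pred))
```

Because each class is weighted by its support, frequent classes dominate the score, which matters when a dataset is imbalanced across disaster types.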
Abstract
The integration of Unmanned Aerial Vehicles (UAVs) with artificial intelligence (AI) models for aerial imagery processing in disaster assessment necessitates models that demonstrate exceptional accuracy, computational efficiency, and real-time processing capabilities. Traditionally, Convolutional Neural Networks (CNNs) demonstrate efficiency in local feature extraction but are limited in their capacity for global context interpretation. On the other hand, Vision Transformers (ViTs) show promise for improved global context interpretation through the use of attention mechanisms, although they remain underinvestigated in UAV-based disaster response applications. Bridging this research gap, we introduce DiRecNetV2, an improved hybrid model that utilizes convolutional and transformer layers. It merges the inductive biases of CNNs for robust feature extraction with the global context…
Taxonomy
Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training
