From Blurry to Brilliant Detection: YOLO-Based Aerial Object Detection with Super Resolution
Ragib Amin Nihal, Benjamin Yen, Takeshi Ashizawa, Katsutoshi Itoyama, Kazuhiro Nakadai

TL;DR
This paper introduces a two-stage aerial object detection framework combining super-resolution and an enhanced YOLOv5 model, significantly improving detection accuracy and efficiency for small objects in aerial imagery.
Contribution
It proposes a novel inference-time super-resolution approach integrated with architectural improvements like EAM and CLFPN, enhancing small object detection in aerial images.
Findings
Achieves 52.5% mAP on the VisDrone dataset.
Super-resolution preprocessing contributes +2.6% mAP.
Architectural innovations contribute +2.9% mAP.
Abstract
Aerial object detection presents challenges from small object sizes, high-density clustering, and image quality degradation caused by distance and motion blur. These factors create an information bottleneck in which the limited pixel representation cannot encode sufficient discriminative features. B2BDet addresses this with a two-stage framework that applies domain-specific super-resolution during inference, followed by detection using an enhanced YOLOv5 architecture. Unlike training-time super-resolution approaches that enhance learned representations, our method recovers visual information from each input image. The approach combines aerial-optimized SRGAN fine-tuning with architectural innovations including an Efficient Attention Module (EAM) and Cross-Layer Feature Pyramid Network (CLFPN). Evaluation across four aerial datasets shows performance gains, with VisDrone achieving 52.5% mAP using…
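The two-stage flow described in the abstract (super-resolve each input at inference time, then detect) can be illustrated with a minimal sketch. This is not the authors' implementation: nearest-neighbour upscaling stands in for the fine-tuned SRGAN, a brightness threshold stands in for the enhanced YOLOv5 detector, and the names `upscale_nearest`, `detect_bright_regions`, and `two_stage_detect` are hypothetical. Mapping the boxes back to the original image coordinates is also an assumption; the abstract does not say how coordinates are handled.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]                          # grayscale image as rows of pixel values
Box = Tuple[float, float, float, float, float]   # (x1, y1, x2, y2, score)


def upscale_nearest(img: Image, scale: int) -> Image:
    """Stand-in for stage 1 (NOT SRGAN): nearest-neighbour upscaling by an integer factor."""
    out: Image = []
    for row in img:
        wide = [p for p in row for _ in range(scale)]     # repeat each pixel horizontally
        out.extend([list(wide) for _ in range(scale)])    # repeat each row vertically
    return out


def detect_bright_regions(img: Image, threshold: int = 128) -> List[Box]:
    """Stand-in for stage 2 (NOT YOLOv5): one box around all pixels above the threshold."""
    ys = [y for y, row in enumerate(img) for p in row if p > threshold]
    xs = [x for row in img for x, p in enumerate(row) if p > threshold]
    if not xs:
        return []
    return [(float(min(xs)), float(min(ys)), float(max(xs) + 1), float(max(ys) + 1), 1.0)]


def two_stage_detect(
    img: Image,
    scale: int,
    upscaler: Callable[[Image, int], Image],
    detector: Callable[[Image], List[Box]],
) -> List[Box]:
    """Upscale the input, detect on the upscaled image, then rescale boxes to input coordinates."""
    upscaled = upscaler(img, scale)
    boxes = detector(upscaled)
    # Assumption: boxes are reported in the original image's coordinate frame.
    return [(x1 / scale, y1 / scale, x2 / scale, y2 / scale, s) for x1, y1, x2, y2, s in boxes]


if __name__ == "__main__":
    image = [[0] * 4 for _ in range(4)]
    image[2][1] = 255                      # a single bright "object" pixel at x=1, y=2
    print(two_stage_detect(image, 2, upscale_nearest, detect_bright_regions))
```

Because both stages are passed in as callables, a real super-resolution model and detector could be substituted without changing the pipeline function.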
Taxonomy
Topics: Advanced Neural Network Applications · Advanced SAR Imaging Techniques · Robotics and Sensor-Based Localization
Methods: Attention Is All You Need · Label Smoothing · Absolute Position Encodings · Linear Layer · Dropout · Layer Normalization · Multi-Head Attention · Byte Pair Encoding · Residual Connection · Adam
