Detecting Small Objects in Thermal Images Using Single-Shot Detector

Hao Zhang, Xianggong Hong, and Li Zhu

arXiv:2108.11101 · cs.CV · August 26, 2021

TL;DR

This paper introduces DDSSD, an improved single-shot detector that strengthens small object detection in thermal images and on standard benchmarks by fusing features through dilated convolution and deconvolution, while running at 41 FPS on 300x300 input.

Contribution

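The paper proposes a novel feature fusion module for SSD-based models that pairs dilated convolution on shallow features (to enlarge their receptive field) with deconvolution on deeper features (to enlarge their feature maps), improving small object detection.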

Findings

1. Achieves 79.7% mAP on PASCAL VOC2007 test
2. Attains 28.3% mmAP on MS COCO test-dev at 41 FPS
3. Outperforms state-of-the-art methods on small object detection in thermal images

Abstract

SSD (Single Shot Multibox Detector) is one of the most successful object detectors thanks to its high accuracy and fast speed. However, the features from the shallow layers of SSD (mainly Conv4_3) lack semantic information, resulting in poor performance on small objects. In this paper, we propose DDSSD (Dilation and Deconvolution Single Shot Multibox Detector), an enhanced SSD with a novel feature fusion module that improves on SSD for small object detection. In the feature fusion module, a dilation convolution module enlarges the receptive field of features from the shallow layers, and a deconvolution module increases the size of feature maps from the higher layers. Our network achieves 79.7% mAP on PASCAL VOC2007 test and 28.3% mmAP on MS COCO test-dev at 41 FPS with only 300x300 input using a single Nvidia 1080 GPU. In particular, for small objects, DDSSD achieves…
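
The fusion idea in the abstract depends on the shallow and deep feature maps ending up at the same spatial size before they are combined. The sketch below only checks that arithmetic with the standard output-size formulas for dilated and transposed convolutions. The 38x38 and 19x19 map sizes follow the usual SSD300 layout; the kernel, dilation, stride, and padding values are illustrative assumptions and are not taken from the paper, and the helper names are hypothetical.

```python
# Illustrative sketch only: the hyperparameters below are assumptions,
# not values reported for DDSSD.

def dilated_conv_output(size, kernel, dilation=1, stride=1, padding=0):
    """Return (output_size, effective_kernel) for a dilated convolution."""
    # A dilated kernel covers kernel + (kernel - 1) * (dilation - 1) pixels.
    effective_kernel = kernel + (kernel - 1) * (dilation - 1)
    output_size = (size + 2 * padding - effective_kernel) // stride + 1
    return output_size, effective_kernel


def deconv_output(size, kernel, stride=1, padding=0, output_padding=0):
    """Return the output size of a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


SHALLOW_SIZE = 38  # Conv4_3 side length in the standard SSD300 layout
DEEP_SIZE = 19     # side length of the next SSD300 feature map

# Assumed shallow branch: 3x3 kernel, dilation 2, padding 2 (keeps 38x38).
shallow_out, effective_kernel = dilated_conv_output(
    SHALLOW_SIZE, kernel=3, dilation=2, padding=2
)

# Assumed deep branch: 2x2 transposed convolution with stride 2 (19 -> 38).
deep_out = deconv_output(DEEP_SIZE, kernel=2, stride=2)

# The two branches can be fused elementwise or by concatenation only if
# their spatial sizes agree.
sizes_match = shallow_out == deep_out

print(shallow_out, effective_kernel, deep_out, sizes_match)
```

With these assumed settings the dilated branch sees a 5x5 window while keeping the 38x38 resolution, and the deconvolution branch upsamples 19x19 to 38x38, so the two maps line up for fusion.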

Taxonomy

Methods: Non Maximum Suppression · 1x1 Convolution · Convolution · SSD