Precise Single-stage Detector

Aisha Chandio; Gong Gui; Teerath Kumar; Irfan Ullah; Ramin Ranjbarzadeh; Arunabha M Roy; Akhtar Hussain; and Yao Shen

arXiv:2210.04252 · cs.CV · October 11, 2022 · 27 cites

TL;DR

The paper introduces PSSD, a modified single-stage object detector that enhances feature extraction and uses a new loss function, reaching 33.8 mAP at 45 FPS on MS COCO (320px input) and 81.28 mAP at 66 FPS on Pascal VOC 2007.

Contribution

The paper proposes a modified SSD architecture with feature enhancement modules and an IoU-guided loss function to improve detection accuracy while keeping real-time speed.

Findings

1. PSSD achieves 33.8 mAP at 45 FPS on MS COCO with 320px input.

2. PSSD attains 81.28 mAP at 66 FPS on Pascal VOC 2007.

3. With a larger input size, PSSD reaches 37.2 mAP at 27 FPS on MS COCO.
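To put the reported throughput in per-image terms, the short sketch below converts frames per second into latency. It assumes the FPS figures were measured one image at a time, which this page does not state, so the results are a rough reading rather than measured latencies. The helper name `latency_ms` is hypothetical.

```python
def latency_ms(fps: float) -> float:
    """Convert throughput in frames per second to milliseconds per frame.

    Assumption: one image is processed at a time, so latency is the
    reciprocal of throughput. Batched inference would break this relation.
    """
    return 1000.0 / fps


# FPS values are the ones listed in the findings above.
reported = [
    ("MS COCO, 320px input", 45),
    ("Pascal VOC 2007", 66),
    ("MS COCO, larger input", 27),
]

for setting, fps in reported:
    print(f"{setting}: {latency_ms(fps):.1f} ms per image")
```

Under that assumption, the three settings correspond to roughly 22.2 ms, 15.2 ms, and 37.0 ms per image.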

Abstract

Two problems in SSD still cause inaccurate results: (1) during feature extraction, as semantic information is acquired layer by layer, local information is gradually lost, resulting in less representative feature maps; (2) during Non-Maximum Suppression (NMS), because the classification and regression tasks are inconsistent, the classification confidence and predicted detection position cannot accurately indicate the position of the prediction boxes. To address these issues, we propose a new architecture, a modified version of the Single Shot MultiBox Detector (SSD), named Precise Single Stage Detector (PSSD). First, we improve the features by adding extra layers to SSD. Second, we construct a simple and effective feature enhancement module to expand the receptive field step by step for each layer and…
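The second problem in the abstract can be illustrated with a minimal sketch of standard greedy NMS. This is the textbook procedure, not the PSSD method or its loss, which the truncated abstract does not describe. The box coordinates, scores, and the 0.5 threshold are made-up values chosen for illustration, and the `iou` and `nms` helpers are hypothetical names.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), corners format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Illustrative (made-up) values: two predictions for one object.
ground_truth: Box = (0.0, 0.0, 10.0, 10.0)
boxes: List[Box] = [
    (1.0, 1.0, 11.0, 11.0),  # better localized, lower class score
    (2.0, 2.0, 12.0, 12.0),  # worse localized, higher class score
]
scores = [0.8, 0.9]

kept = nms(boxes, scores)
print("kept indices:", kept)
print("IoU with ground truth:", [iou(b, ground_truth) for b in boxes])
```

Because NMS ranks boxes by classification confidence alone, the sketch keeps the box with the higher score even though the suppressed box overlaps the ground truth more. That mismatch between confidence and localization quality is the inconsistency the abstract points to.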

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Processing Techniques and Applications · Advanced Neural Network Applications

Methods: 1x1 Convolution · Non Maximum Suppression · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution · SSD