Replication Study and Benchmarking of Real-Time Object Detection Models
Pierre-Luc Asselin, Vincent Coulombe, William Guimont-Martin, William Larrivée-Hardy

TL;DR
This paper assesses how reproducible real-time object detection models are and benchmarks them, highlighting the trade-off between accuracy and inference speed and providing a unified pipeline for fair comparison.
Contribution
The paper introduces a unified training and evaluation pipeline for object detection models, based on MMDetection's features, and benchmarks the accuracy and inference speed of several models on multiple GPUs.
Findings
The reproduced RTMDet and YOLOv7 match the performance reported in the original papers.
The reproduced DETR and ViTDet do not reach the accuracy or speed reported in the original papers.
Large models' speed is severely limited by hardware resources.
Abstract
This work examines the reproducibility and benchmarking of state-of-the-art real-time object detection models. As object detection models are often used in real-world contexts, such as robotics, where inference time is paramount, simply measuring models' accuracy is not enough to compare them. We thus compare the accuracy and inference speed of a large variety of object detection models on multiple graphics cards. In addition to this large benchmarking effort, we also reproduce the following models from scratch using PyTorch on the MS COCO 2017 dataset: DETR, RTMDet, ViTDet and YOLOv7. More importantly, we propose a unified training and evaluation pipeline, based on MMDetection's features, to better compare models. Our implementations of DETR and ViTDet could not achieve accuracy or speed comparable to those reported in the original papers. On the other hand, reproduced RTMDet…
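The abstract does not say how inference speed was measured, so the following is only a minimal sketch of one common way to time a detector: run a few untimed warm-up calls, then record per-image latency. The names `benchmark_latency`, `dummy_detector`, and the `sync` hook are hypothetical and are not taken from the paper or from MMDetection.

```python
import statistics
import time
from typing import Any, Callable, Dict, Sequence


def benchmark_latency(
    infer: Callable[[Any], Any],
    inputs: Sequence[Any],
    warmup: int = 5,
    sync: Callable[[], None] = lambda: None,
) -> Dict[str, float]:
    """Time `infer` on each input after a few untimed warm-up calls."""
    if not inputs:
        raise ValueError("inputs must not be empty")

    # Warm-up calls are excluded so one-time costs (lazy init, caches)
    # do not skew the measured latencies.
    for i in range(warmup):
        infer(inputs[i % len(inputs)])
    sync()

    latencies = []
    for x in inputs:
        start = time.perf_counter()
        infer(x)
        sync()  # on a GPU, this hook would wait for queued work to finish
        latencies.append(time.perf_counter() - start)

    mean_s = statistics.fmean(latencies)
    return {
        "mean_ms": mean_s * 1e3,
        "median_ms": statistics.median(latencies) * 1e3,
        "fps": 1.0 / mean_s if mean_s > 0 else float("inf"),
        "n": float(len(latencies)),
    }


def dummy_detector(image):
    # Hypothetical stand-in for a real detector: one fake box per image.
    return [(0, 0, len(image), len(image[0]))]


if __name__ == "__main__":
    images = [[[0] * 64 for _ in range(64)] for _ in range(20)]
    print(benchmark_latency(dummy_detector, images))
```

With a PyTorch model on a CUDA device, `torch.cuda.synchronize` could be passed as `sync` so that asynchronous GPU work is included in each timing. Whether the authors measured speed this way is an assumption.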
Taxonomy
Topics: Infrared Target Detection Methodologies · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Byte Pair Encoding · Adam · Dropout · Multi-Head Attention
