Optimizing Anchor-based Detectors for Autonomous Driving Scenes
Xianzhi Du, Wei-Chih Hung, Tsung-Yi Lin

TL;DR
This paper adapts anchor-based detectors to autonomous driving scenes by improving small-object detection and by scaling input resolution and model size for a better speed-accuracy trade-off. The resulting models reach state-of-the-art results on the real-time 2D detection track of the Waymo Open Dataset.
Contribution
The paper introduces a set of framework improvements to the RCNN-RS and RetinaNet-RS detectors for detecting small objects in crowded scenes, plus a model scaling strategy (input resolution and model size) that improves the speed-accuracy trade-off.
Findings
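The largest Cascade RCNN-RS model achieves 76.9% AP/L1 and 70.1% AP/L2 within the 70 ms/frame latency constraint on a V100 GPU.
The fastest RetinaNet-RS model runs at 6.3 ms/frame with 50.7% AP/L1.
The largest model sets a new state of the art on the real-time 2D detection track of the Waymo Open Dataset (WOD).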
Abstract
This paper summarizes model improvements and inference-time optimizations for popular anchor-based detectors in autonomous driving scenes. Based on the high-performing RCNN-RS and RetinaNet-RS detection frameworks designed for common detection scenes, we study a set of framework improvements that adapt the detectors to better detect small objects in crowded scenes. Then, we propose a model scaling strategy that scales input resolution and model size to achieve a better speed-accuracy trade-off curve. We evaluate our family of models on the real-time 2D detection track of the Waymo Open Dataset (WOD). Within the 70 ms/frame latency constraint on a V100 GPU, our largest Cascade RCNN-RS model achieves 76.9% AP/L1 and 70.1% AP/L2, attaining the new state of the art on WOD real-time 2D detection. Our fastest RetinaNet-RS model achieves 6.3 ms/frame while maintaining a reasonable…
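To make the latency-constrained comparison concrete, here is a minimal Python sketch of choosing the most accurate model under a per-frame latency budget. It is an illustration only, not the authors' code: the `ModelPoint` class and `best_under_budget` function are hypothetical names, and the 70.0 ms latency for the largest Cascade RCNN-RS model is an assumed upper bound, since the page states only that the model runs within the 70 ms/frame constraint.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelPoint:
    """One point on a speed-accuracy curve (hypothetical helper type)."""
    name: str
    latency_ms: float  # per-frame inference latency in milliseconds
    ap_l1: float       # AP/L1 in percent


def best_under_budget(points: Sequence[ModelPoint],
                      budget_ms: float) -> Optional[ModelPoint]:
    """Return the highest-AP/L1 model whose latency fits the budget."""
    feasible = [p for p in points if p.latency_ms <= budget_ms]
    # max(..., default=None) yields None when no model fits the budget.
    return max(feasible, key=lambda p: p.ap_l1, default=None)


points = [
    # ASSUMPTION: 70.0 ms is only an upper bound taken from the latency
    # constraint; the exact latency is not given on this page.
    ModelPoint("Cascade RCNN-RS (largest)", 70.0, 76.9),
    ModelPoint("RetinaNet-RS (fastest)", 6.3, 50.7),
]

for budget in (70.0, 10.0, 5.0):
    choice = best_under_budget(points, budget)
    print(budget, choice.name if choice else "no model fits")
```

With only the two reported points, a 70 ms budget selects the Cascade RCNN-RS model and a 10 ms budget selects the RetinaNet-RS model. A full speed-accuracy curve would add the intermediate models produced by scaling input resolution and model size.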
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Autonomous Vehicle Technology and Safety
Methods: Max Pooling · Residual Connection · Convolution · Average Pooling · Residual Block · Batch Normalization · Dense Connections · Xavier Initialization · Global Average Pooling
