Real-Time Anchor-Free Single-Stage 3D Detection with IoU-Awareness
Runzhou Ge, Zhuangzhuang Ding, Yihan Hu, Wenxin Shao, Li Huang, Kun Li, Qiang Liu

TL;DR
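This report presents AFDetV2, a real-time, anchor-free, single-stage 3D detector that won the Real-time 3D Detection track of the Waymo Open Dataset Challenges at CVPR 2021. It reaches 73.12 mAPH/L2 at a latency of 60.06 ms by combining a lite 3D feature extractor, an improved RPN, an IoU-aware confidence sub-head, and training and implementation refinements.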
Contribution
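Introduces AFDetV2, an extension of the earlier AFDet model: a lightweight, anchor-free 3D detector with a lite 3D feature extractor, an RPN with an extended receptive field, and a sub-head that produces an IoU-aware confidence score, aimed at real-time autonomous driving applications.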
Findings
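AFDetV2 achieved 73.12 mAPH/L2 at a latency of 60.06 ms; the AFDetV2-base variant achieved 72.57 mAPH/L2.
Designed a lite 3D feature extractor, an RPN with an extended receptive field, and an IoU-aware confidence sub-head.
Won the Real-time 3D Detection track of the Waymo Open Dataset Challenges at CVPR 2021, with AFDetV2-base named the "Most Efficient Model".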
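The IoU-aware confidence scoring mentioned above can be illustrated with a minimal sketch. This page does not state how the predicted IoU is combined with the classification score, so the geometric blend below, the function names `iou_aware_score` and `rerank`, and the weight `alpha` are all assumptions chosen for illustration, not the authors' confirmed formulation.

```python
# Illustrative sketch only: the blending rule and `alpha` are assumptions,
# not taken from the AFDetV2 report as summarized on this page.

def iou_aware_score(cls_score, iou_pred, alpha=0.5):
    """Blend a classification score with a predicted IoU.

    alpha = 0 keeps the plain classification score;
    alpha = 1 ranks purely by predicted IoU.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Clamp the predicted IoU into [0, 1] so the power is well defined.
    iou = min(max(iou_pred, 0.0), 1.0)
    return (cls_score ** (1.0 - alpha)) * (iou ** alpha)


def rerank(detections, alpha=0.5):
    """Sort (name, cls_score, iou_pred) tuples by the blended score, best first."""
    return sorted(
        detections,
        key=lambda d: iou_aware_score(d[1], d[2], alpha),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical detections: a confident but poorly localized box,
    # and a less confident but well localized box.
    dets = [("box_a", 0.9, 0.2), ("box_b", 0.7, 0.8)]
    for name, cls_score, iou_pred in rerank(dets):
        print(name, round(iou_aware_score(cls_score, iou_pred), 3))
```

The point of such a blend is that a box with a high class score but poor predicted localization can be ranked below a slightly less confident but better localized box, which is the ordering that IoU-thresholded metrics reward.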
Abstract
In this report, we introduce our winning solution to the Real-time 3D Detection and also the "Most Efficient Model" in the Waymo Open Dataset Challenges at CVPR 2021. Extended from our last year's award-winning model AFDet, we have made a handful of modifications to the base model, to improve the accuracy and at the same time to greatly reduce the latency. The modified model, named as AFDetV2, is featured with a lite 3D Feature Extractor, an improved RPN with extended receptive field and an added sub-head that produces an IoU-aware confidence score. These model enhancements, together with enriched data augmentation, stochastic weights averaging, and a GPU-based implementation of voxelization, lead to a winning accuracy of 73.12 mAPH/L2 for our AFDetV2 with a latency of 60.06 ms, and an accuracy of 72.57 mAPH/L2 for our AFDetV2-base, entitled as the "Most Efficient Model" by the…
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques
Methods: Region Proposal Network
