Rethinking IoU-based Optimization for Single-stage 3D Object Detection
Hualian Sheng, Sijia Cai, Na Zhao, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Min-Jian Zhao, Gim Hee Lee

TL;DR
This paper introduces RDIoU, a novel rotation-decoupled IoU metric that improves the training efficiency and accuracy of single-stage 3D object detectors by addressing rotation sensitivity issues.
Contribution
It proposes RDIoU, a new IoU-based optimization method that decouples rotation, enhancing training stability and detection performance in 3D object detection.
Findings
RDIoU improves detection accuracy on the KITTI and Waymo datasets.
It reduces training instability caused by rotation sensitivity.
The method enhances bounding box regression precision.
Abstract
Since Intersection-over-Union (IoU) based optimization maintains the consistency of the final IoU prediction metric and losses, it has been widely used in both regression and classification branches of single-stage 2D object detectors. Recently, several 3D object detection methods adopt IoU-based optimization and directly replace the 2D IoU with 3D IoU. However, such a direct computation in 3D is very costly due to the complex implementation and inefficient backward operations. Moreover, 3D IoU-based optimization is sub-optimal as it is sensitive to rotation and thus can cause training instability and detection performance deterioration. In this paper, we propose a novel Rotation-Decoupled IoU (RDIoU) method that can mitigate the rotation-sensitivity issue, and produce more efficient optimization objectives compared with 3D IoU during the training stage. Specifically, our RDIoU…
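The rotation sensitivity mentioned in the abstract can be illustrated with a small sketch. This is not the authors' RDIoU, whose formulation is not given in the text above. It is only a plain rotated-box IoU in the bird's-eye view (BEV), which stands in for 3D IoU under the assumption that both boxes share the same height and vertical position. The helper names (`box_corners`, `clip_polygon`, `bev_iou`) and the car-sized 4.0 m × 1.6 m box are hypothetical choices made for illustration.

```python
import math


def box_corners(cx, cy, length, width, yaw):
    """Return the four BEV corners of a rotated box in counter-clockwise order."""
    c, s = math.cos(yaw), math.sin(yaw)
    half = [(length / 2, width / 2), (-length / 2, width / 2),
            (-length / 2, -width / 2), (length / 2, -width / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]


def polygon_area(pts):
    """Shoelace formula for the area of a simple polygon."""
    area = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2


def clip_polygon(subject, clip):
    """Sutherland-Hodgman clipping of `subject` by a convex, CCW `clip` polygon."""
    output = subject
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]

        def side(p):
            # Positive when p lies to the left of the directed edge a -> b,
            # which is the inside of a counter-clockwise polygon.
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        inp, output = output, []
        for j in range(len(inp)):
            p, q = inp[j], inp[(j + 1) % len(inp)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                # The segment p -> q crosses the clip edge: keep the crossing point.
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
        if not output:
            return []
    return output


def bev_iou(box_a, box_b):
    """IoU of two rotated boxes given as (cx, cy, length, width, yaw in radians)."""
    pa, pb = box_corners(*box_a), box_corners(*box_b)
    inter_pts = clip_polygon(pa, pb)
    inter = polygon_area(inter_pts) if len(inter_pts) >= 3 else 0.0
    union = polygon_area(pa) + polygon_area(pb) - inter
    return inter / union


if __name__ == "__main__":
    # Same center and size; only the yaw of the second box changes.
    reference = (0.0, 0.0, 4.0, 1.6, 0.0)
    for deg in (0, 5, 10, 20, 45, 90):
        rotated = (0.0, 0.0, 4.0, 1.6, math.radians(deg))
        print(f"yaw offset {deg:3d} deg -> BEV IoU {bev_iou(reference, rotated):.3f}")
```

Running the script prints how the IoU of two otherwise identical boxes falls as only the yaw offset grows. At a 90° offset the overlap of this elongated box is a 1.6 m × 1.6 m square, so the IoU is 0.25 even though center and size match exactly. The polygon clipping needed even for this simplified case also hints at why a direct 3D IoU is costly to implement and differentiate.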
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition
