SUIT: Learning Significance-guided Information for 3D Temporal Detection
Zheyuan Zhou, Jiachen Lu, Yihan Zeng, Hang Xu, Li Zhang

TL;DR
SUIT introduces a sparse, significance-guided approach for 3D temporal detection that efficiently fuses information across frames, reducing computational costs while maintaining high detection performance.
Contribution
The paper proposes a novel sparse feature-based temporal fusion method with significance-guided sampling and geometric transformation learning for 3D detection.
Findings
Reduces memory and computation costs significantly.
Achieves competitive or superior detection performance.
Effective on large-scale autonomous driving datasets.
Abstract
3D object detection from LiDAR point clouds is of critical importance for autonomous driving and robotics. While sequential point clouds have the potential to enhance 3D perception through temporal information, utilizing these temporal features effectively and efficiently remains a challenging problem. Based on the observation that foreground information is sparsely distributed in LiDAR scenes, we believe sufficient knowledge can be provided by a sparse format rather than dense maps. To this end, we propose to learn Significance-gUided Information for 3D Temporal detection (SUIT), which simplifies temporal information into sparse features for information fusion across frames. Specifically, we first introduce a significant sampling mechanism that extracts information-rich yet sparse features based on predicted object centroids. On top of that, we present an explicit geometric transformation…
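As a rough illustration of the sampling idea in the abstract, the sketch below keeps only the bird's-eye-view cells with the highest predicted-centroid scores and gathers their features into a sparse set. It is not the authors' implementation: the function name `sample_significant_features`, the `(C, H, W)` feature layout, the `(H, W)` score map, and the plain top-k selection rule are all assumptions made for this example.

```python
import numpy as np


def sample_significant_features(features, heatmap, k):
    """Keep the k cells with the highest centroid score (illustrative only).

    features: (C, H, W) dense feature map (assumed layout).
    heatmap:  (H, W) per-cell predicted-centroid score (assumed input).
    Returns coords of shape (k, 2) as (row, col) and sparse of shape (k, C),
    both ordered from highest to lowest score.
    """
    _, h, w = features.shape
    scores = heatmap.reshape(-1)
    k = min(k, scores.size)
    # argpartition finds the top-k cells without sorting the whole map;
    # the second step orders just those k cells by descending score.
    top = np.argpartition(-scores, k - 1)[:k]
    top = top[np.argsort(-scores[top])]
    rows, cols = np.unravel_index(top, (h, w))
    coords = np.stack([rows, cols], axis=1)
    sparse = features[:, rows, cols].T  # (k, C)
    return coords, sparse
```

With a 200 x 200 map and `k = 500`, for example, later fusion steps would operate on 500 feature vectors instead of 40,000 cells, which is the kind of saving a sparse format is meant to provide.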
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Remote Sensing and LiDAR Applications
