MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
Runsen Xu, Tai Wang, Wenwei Zhang, Runjian Chen, Jinkun Cao, Jiangmiao Pang, Dahua Lin

TL;DR
This paper presents MV-JAR, a novel self-supervised pre-training method for LiDAR-based 3D object detection that improves performance by modeling voxel and point distributions, and introduces a new data-efficient benchmark.
Contribution
MV-JAR introduces masking and reconstruction strategies tailored for LiDAR data and proposes a new benchmark for more accurate evaluation of data-efficient 3D detection methods.
Findings
MV-JAR achieves up to 6.3% improvement in mAPH over training from scratch.
The new benchmark ensures diverse fine-tuning splits for better evaluation.
Experiments on Waymo and KITTI datasets validate the effectiveness of MV-JAR.
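The point about diverse fine-tuning splits can be illustrated with a small sketch. It contrasts the prior practice described in the abstract (taking a uniform subset of frames from every LiDAR sequence) with a hypothetical alternative that samples whole sequences. The function names, the `sequences` layout, and the sequence-level strategy are all assumptions made for illustration; this is not necessarily how the benchmark itself is constructed.

```python
import random


def uniform_per_sequence_split(sequences, fraction):
    """Take every k-th frame from every sequence.

    Every sequence contributes frames, so scene diversity barely changes
    with `fraction`; only the number of frames does.
    """
    step = max(1, round(1 / fraction))
    return [(sid, f) for sid, frames in sequences.items() for f in frames[::step]]


def sequence_level_split(sequences, fraction, seed=0):
    """Hypothetical alternative: sample whole sequences instead of frames.

    Smaller fractions now cover fewer distinct scenes, so diversity
    actually varies across splits.
    """
    rng = random.Random(seed)
    ids = sorted(sequences)
    k = max(1, round(fraction * len(ids)))
    chosen = rng.sample(ids, k)
    return [(sid, f) for sid in chosen for f in sequences[sid]]


# Toy data: 10 sequences of 20 frames each (made-up identifiers).
sequences = {f"seq{i}": list(range(20)) for i in range(10)}
uniform = uniform_per_sequence_split(sequences, 0.1)
by_sequence = sequence_level_split(sequences, 0.1)
print(len({sid for sid, _ in uniform}), len({sid for sid, _ in by_sequence}))
```

In this toy setup both splits contain the same number of frames, but the uniform split still touches every sequence while the sequence-level split touches only one.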
Abstract
This paper introduces the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training and a carefully designed data-efficient 3D object detection benchmark on the Waymo dataset. Inspired by the scene-voxel-point hierarchy in downstream 3D object detectors, we design masking and reconstruction strategies accounting for voxel distributions in the scene and local point distributions within the voxel. We employ a Reversed-Furthest-Voxel-Sampling strategy to address the uneven distribution of LiDAR points and propose MV-JAR, which combines two techniques for modeling the aforementioned distributions, resulting in superior performance. Our experiments reveal limitations in previous data-efficient experiments, which uniformly sample fine-tuning splits with varying data proportions from each LiDAR sequence, leading to similar data diversity across splits.…
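A minimal sketch of how a Reversed-Furthest-Voxel-Sampling mask could work. The abstract only names the strategy, so the reading used here is an assumption: greedy furthest sampling picks the voxels to keep, and every voxel that was not picked is masked. Under that reading, isolated voxels in sparse regions tend to survive and masking concentrates in dense regions. The function names, the `mask_ratio` parameter, and the use of voxel centers as plain coordinate tuples are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def furthest_voxel_sampling(centers, num_keep, seed=0):
    """Greedy furthest-point sampling over voxel centers; returns kept indices."""
    n = len(centers)
    num_keep = min(num_keep, n)
    if num_keep <= 0:
        return []
    first = random.Random(seed).randrange(n)
    kept = [first]
    # Squared distance from each voxel to its nearest kept voxel.
    min_d2 = [sq_dist(c, centers[first]) for c in centers]
    min_d2[first] = -1.0  # already kept, never pick again
    while len(kept) < num_keep:
        nxt = max(range(n), key=min_d2.__getitem__)
        kept.append(nxt)
        min_d2[nxt] = -1.0
        for i, c in enumerate(centers):
            if min_d2[i] >= 0:
                min_d2[i] = min(min_d2[i], sq_dist(c, centers[nxt]))
    return kept


def reversed_fvs_mask(centers, mask_ratio, seed=0):
    """Assumed 'reversed' step: keep the sampled voxels, mask all others."""
    n = len(centers)
    num_keep = n - int(round(mask_ratio * n))
    kept = set(furthest_voxel_sampling(centers, num_keep, seed))
    return [i not in kept for i in range(n)]


# Toy scene: a dense cluster of 8 voxels plus 2 isolated voxels.
centers = [(0.1 * i, 0.0, 0.0) for i in range(8)] + [(50.0, 0.0, 0.0), (0.0, 50.0, 0.0)]
mask = reversed_fvs_mask(centers, mask_ratio=0.6)
print(mask)
```

In the toy scene the two isolated voxels stay unmasked regardless of the random starting voxel, because furthest sampling reaches them before it returns to the dense cluster.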
Taxonomy
Topics: Advanced Neural Network Applications · 3D Surveying and Cultural Heritage · Image and Object Detection Techniques
Methods: Jigsaw
