PseudoAugment: Learning to Use Unlabeled Data for Data Augmentation in Point Clouds
Zhaoqi Leng, Shuyang Cheng, Benjamin Caine, Weiyue Wang, Xiao Zhang, Jonathon Shlens, Mingxing Tan, Dragomir Anguelov

TL;DR
This paper introduces PseudoAugment, a novel approach that leverages unlabeled data for data augmentation in 3D point cloud detection, significantly improving data efficiency and model performance.
Contribution
It proposes three pseudo-label based augmentation policies and a unified framework for hyperparameter tuning, enhancing data diversity and reducing computational costs.
Findings
Outperforms state-of-the-art augmentation methods on the Waymo dataset
Achieves 3x data efficiency on vehicle detection tasks
Nearly matches full dataset training with only 10% labeled data
Abstract
Data augmentation is an important technique to improve data efficiency and save labeling cost for 3D detection in point clouds. Yet, existing augmentation policies have so far been designed to utilize only labeled data, which limits the data diversity. In this paper, we recognize that pseudo labeling and data augmentation are complementary, and thus propose to leverage unlabeled data for data augmentation to enrich the training data. In particular, we design three novel pseudo-label based data augmentation policies (PseudoAugments) to fuse both labeled and pseudo-labeled scenes, including frames (PseudoFrame), objects (PseudoBBox), and background (PseudoBackground). PseudoAugments outperform pseudo labeling by mitigating pseudo labeling errors and generating diverse fused training scenes. We demonstrate PseudoAugments generalize across point-based and voxel-based architectures, different…
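The object-level policy described above (PseudoBBox) fuses pseudo-labeled objects into labeled scenes. The sketch below is only a rough illustration of that idea and is not the authors' implementation. Everything specific in it is an assumption made for the example: the `Box` and `Scene` types, axis-aligned boxes, the `min_score` confidence threshold, the overlap check, and the function name `paste_pseudo_objects`.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Box:
    """Axis-aligned 3D box (a simplification; real detectors also predict heading)."""
    center: Point
    size: Point          # full extent along x, y, z
    score: float = 1.0   # 1.0 for human labels, detector confidence for pseudo labels

    def contains(self, p: Point) -> bool:
        return all(abs(p[i] - self.center[i]) <= self.size[i] / 2 for i in range(3))

    def overlaps(self, other: "Box") -> bool:
        return all(
            abs(self.center[i] - other.center[i]) < (self.size[i] + other.size[i]) / 2
            for i in range(3)
        )


@dataclass
class Scene:
    points: List[Point]
    boxes: List[Box]


def paste_pseudo_objects(labeled: Scene, pseudo: Scene, min_score: float = 0.8) -> Scene:
    """Hypothetical PseudoBBox-style fusion: copy confident pseudo-labeled
    objects (box plus the points inside it) into a labeled scene."""
    points = list(labeled.points)   # copy so the labeled scene is not mutated
    boxes = list(labeled.boxes)
    for box in pseudo.boxes:
        if box.score < min_score:
            continue                # assumed filter for low-confidence pseudo labels
        if any(box.overlaps(existing) for existing in boxes):
            continue                # assumed collision check against existing objects
        # Clear labeled points from the target region, then paste the object's points.
        points = [p for p in points if not box.contains(p)]
        points.extend(p for p in pseudo.points if box.contains(p))
        boxes.append(box)
    return Scene(points, boxes)


# Toy usage: one labeled object, three pseudo-labeled candidates.
labeled = Scene(
    points=[(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    boxes=[Box(center=(0.0, 0.0, 0.0), size=(2.0, 2.0, 2.0))],
)
pseudo = Scene(
    points=[(10.2, 0.0, 0.0), (20.0, 0.0, 0.0), (0.5, 0.0, 0.0)],
    boxes=[
        Box(center=(10.0, 0.0, 0.0), size=(2.0, 2.0, 2.0), score=0.9),   # confident, free space
        Box(center=(20.0, 0.0, 0.0), size=(2.0, 2.0, 2.0), score=0.3),   # low confidence
        Box(center=(0.5, 0.0, 0.0), size=(2.0, 2.0, 2.0), score=0.95),   # collides with a label
    ],
)
fused = paste_pseudo_objects(labeled, pseudo)
```

In this toy setup only the first pseudo-labeled object is pasted: the second falls below the assumed score threshold and the third collides with an existing labeled box. Filtering of this kind is one plausible way to limit the effect of pseudo labeling errors; the thresholds and checks actually used in the paper may differ.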
