TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception
Runjian Chen, Hyoungseob Park, Bo Zhang, Wenqi Shao, Ping Luo, Alex Wong

TL;DR
TREND introduces a novel unsupervised 3D representation learning method for LiDAR data by forecasting future observations using temporal information, significantly improving downstream 3D detection performance.
Contribution
TREND is the first to leverage temporal forecasting with a recurrent embedding and a neural field for unsupervised 3D representation learning in LiDAR perception.
Findings
Up to 90% more improvement on downstream 3D object detection than previous SOTA unsupervised 3D pre-training methods
Effective across multiple datasets and models
Demonstrates the benefit of temporal forecasting in LiDAR perception
Abstract
Labeling LiDAR point clouds is notoriously time- and energy-consuming, which has spurred recent unsupervised 3D representation learning methods that alleviate the labeling burden in LiDAR perception via pretrained weights. Almost all existing work focuses on a single frame of LiDAR point cloud and neglects the temporal LiDAR sequence, which naturally accounts for object motion (and object semantics). Instead, we propose TREND, namely Temporal REndering with Neural fielD, to learn 3D representation via forecasting the future observation in an unsupervised manner. Unlike existing work that follows conventional contrastive learning or masked autoencoding paradigms, TREND integrates forecasting for 3D pre-training through a Recurrent Embedding scheme to generate 3D embeddings across time and a Temporal Neural Field to represent the 3D scene, through which we compute the loss using differentiable…
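The abstract is cut off before the loss is described, so the following is only a generic sketch of how a forecasting loss could be computed by rendering a neural field along LiDAR rays. It assumes standard density-based volume rendering, which may differ from the formulation TREND actually uses. The names `render_ray_depth` and `forecasting_loss`, the density input `sigma`, and the L1 range error are all hypothetical choices made for illustration.

```python
import numpy as np

def render_ray_depth(sigma, t_vals):
    """Expected depth along one ray via standard volume rendering.

    sigma:  (N,) non-negative densities, one per ray segment
            (assumed output of some neural field; hypothetical).
    t_vals: (N+1,) increasing distances that delimit the N segments.
    """
    sigma = np.asarray(sigma, dtype=np.float64)
    t_vals = np.asarray(t_vals, dtype=np.float64)
    deltas = np.diff(t_vals)                   # segment lengths, shape (N,)
    mids = 0.5 * (t_vals[:-1] + t_vals[1:])    # segment midpoints, shape (N,)
    alpha = 1.0 - np.exp(-sigma * deltas)      # per-segment opacity
    # Transmittance: fraction of the ray that reaches segment i unobstructed.
    trans = np.concatenate(([1.0], np.cumprod(1.0 - alpha)[:-1]))
    weights = trans * alpha
    # Normalise so the result is a weighted average of midpoints.
    depth = np.sum(weights * mids) / max(np.sum(weights), 1e-8)
    return depth, weights

def forecasting_loss(pred_depths, future_depths):
    """Mean L1 error between rendered depths and the ranges measured
    in the future LiDAR frame (an assumed loss, for illustration)."""
    pred = np.asarray(pred_depths, dtype=np.float64)
    target = np.asarray(future_depths, dtype=np.float64)
    return float(np.mean(np.abs(pred - target)))
```

In a pre-training loop of this kind, the densities would come from a field conditioned on embeddings forecast from past frames, and the target ranges would come from the LiDAR sweep actually observed at the future timestep, so no labels are needed.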
Taxonomy
Topics: Image Processing and 3D Reconstruction · Advanced Neural Network Applications · 3D Shape Modeling and Analysis
Methods: Contrastive Learning · Focus
