TL;DR
This paper introduces a transferable, efficient LiDAR world model using latent flow matching, achieving state-of-the-art results in semantic occupancy forecasting with reduced data and computational costs.
Contribution
We develop a latent flow matching framework for LiDAR world modeling that enhances transferability, efficiency, and accuracy across multiple domains and tasks.
Findings
Up to 11% absolute improvement (83% relative) over training from scratch.
State-of-the-art semantic occupancy forecasting with only 5% of the training data used by prior methods.
23x faster inference than prior models.
Abstract
LiDAR-based world models offer more structured and geometry-aware representations than their image-based counterparts. However, existing LiDAR world models are narrowly trained; each model excels only in the domain for which it was built. Can we develop LiDAR world models that exhibit strong transferability across multiple domains? We conduct the first systematic domain transfer study across three demanding scenarios: (i) outdoor to indoor generalization, (ii) sparse-beam & dense-beam adaptation, and (iii) non-semantic to semantic transfer. Given different amounts of fine-tuning data, our experiments show that a single pre-trained model can achieve up to 11% absolute improvement (83% relative) over training from scratch and outperforms training from scratch in 30/36 of our comparisons. This transferability of dynamic learning significantly reduces the reliance on manually annotated data…
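The abstract quotes the headline gain in both absolute and relative terms, and the two are easy to conflate. The sketch below shows how they relate. It assumes the 11% absolute and 83% relative figures describe the same comparison, which the abstract does not state explicitly. The function name and the derived baseline are illustrative and are not values reported by the paper.

```python
def implied_baseline(absolute_gain: float, relative_gain: float) -> float:
    """Return the baseline score b satisfying b + absolute_gain == b * (1 + relative_gain).

    Rearranging gives b = absolute_gain / relative_gain.
    """
    return absolute_gain / relative_gain


# Figures quoted in the abstract; pairing them is an assumption.
absolute_gain = 11.0   # percentage points
relative_gain = 0.83   # 83% relative improvement

baseline = implied_baseline(absolute_gain, relative_gain)
finetuned = baseline + absolute_gain

# Share of comparisons in which pre-training beat training from scratch.
win_rate = 30 / 36

print(f"implied from-scratch score: {baseline:.1f}")
print(f"implied pre-trained score:  {finetuned:.1f}")
print(f"win rate: {win_rate:.1%}")
```

Under that assumption, the largest gain occurs where the from-scratch baseline is low (around 13 points), which is why a modest absolute gain translates into a large relative one.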