TL;DR
SynFlow introduces a synthetic LiDAR scene flow dataset generated through a motion-focused simulation pipeline, enabling models to learn robust motion priors that generalize well to real-world data, even with minimal real labels.
Contribution
The paper presents SynFlow, a large-scale synthetic dataset for LiDAR scene flow, and demonstrates its effectiveness in training models that generalize across real-world benchmarks.
Findings
Models trained on SynFlow-4k generalize well to real data in zero-shot settings.
Models pretrained on SynFlow-4k and fine-tuned with only 5% of real labels surpass models trained on the full real data.
SynFlow-4k represents a 34x scale-up in annotated volume over existing real-world benchmarks.
Abstract
Reliable 3D dynamic perception requires models that can anticipate motion beyond predefined categories, yet progress is hindered by the scarcity of dense, high-quality motion annotations. While self-supervision on unlabeled real data offers a path forward, empirical evidence suggests that scaling unlabeled data fails to close the performance gap due to noisy proxy signals. In this paper, we propose a shift in paradigm: learning robust real-world motion priors entirely from scalable simulation. We introduce SynFlow, a data generation pipeline that produces a large-scale synthetic dataset specifically designed for LiDAR scene flow. Unlike prior works that prioritize sensor-specific realism, SynFlow employs a motion-oriented strategy to synthesize diverse kinematic patterns across 4,000 sequences (940k frames), termed SynFlow-4k. This represents a 34x scale-up in annotated volume over…
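To illustrate why simulation can supply dense motion labels at scale, the sketch below computes per-point scene flow for an object whose motion is known exactly. It is a minimal illustration under stated assumptions, not the SynFlow pipeline: the function name `rigid_flow`, the restriction to a yaw rotation plus translation, and the sample points are all hypothetical.

```python
import math

def rigid_flow(points, yaw_rad, translation):
    """Per-point flow for points moved by a yaw rotation plus a translation.

    Hypothetical helper: assumes the object moves rigidly between two frames.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    flows = []
    for x, y, z in points:
        # Position of the point in the next frame under the known motion.
        nx = c * x - s * y + tx
        ny = s * x + c * y + ty
        nz = z + tz
        # Scene flow is the 3D displacement between the two frames.
        flows.append((nx - x, ny - y, nz - z))
    return flows

# Hypothetical object: three points moving 0.5 m along x with no turn.
object_points = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.5), (0.0, -1.0, 1.0)]
straight_flow = rigid_flow(object_points, yaw_rad=0.0, translation=(0.5, 0.0, 0.0))

# Same object turning 90 degrees in place: each point gets a different flow.
turning_flow = rigid_flow(object_points, yaw_rad=math.pi / 2, translation=(0.0, 0.0, 0.0))
```

Because the simulator knows the motion that produced the next frame, every point receives an exact flow vector with no manual annotation, which is what makes labels of this kind cheap to scale.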
