TL;DR
This paper demonstrates that a vehicle detector trained only on synthetic, photo-realistic images from a virtual world can outperform the identical architecture trained on human-annotated real-world data when both are tested on KITTI.
Contribution
The authors introduce a method to use synthetic images from simulation engines for training deep learning models, reducing reliance on costly human annotations.
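One reason simulation can sidestep manual labeling is that a rendering engine already knows which object produced each pixel. The sketch below illustrates that idea only: it assumes a hypothetical per-pixel instance-ID mask (0 for background) and derives a tight 2D bounding box for each object. The function name `boxes_from_instance_mask` and the mask format are assumptions made for illustration; this summary does not describe the authors' actual annotation pipeline.

```python
from typing import Dict, List, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates
Box = Tuple[int, int, int, int]


def boxes_from_instance_mask(
    mask: List[List[int]], background: int = 0
) -> Dict[int, Box]:
    """Return one tight bounding box per instance ID found in the mask.

    `mask` is a hypothetical simulator output: mask[y][x] holds the ID of
    the object rendered at that pixel, or `background` if there is none.
    """
    boxes: Dict[int, Box] = {}
    for y, row in enumerate(mask):
        for x, inst in enumerate(row):
            if inst == background:
                continue
            if inst not in boxes:
                # First pixel seen for this instance: start a 1x1 box.
                boxes[inst] = (x, y, x, y)
            else:
                # Grow the existing box to include this pixel.
                x0, y0, x1, y1 = boxes[inst]
                boxes[inst] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


# Toy 4x6 "frame" containing two objects with IDs 1 and 2.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]
boxes = boxes_from_instance_mask(mask)
```

In a real setup the mask would come from the simulator's render buffers, and the resulting boxes would be filtered (for example by size or occlusion) before training; those steps are left out here.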
Findings
Models trained only on synthetic data outperform the same architecture trained on human-annotated real-world data when tested on the KITTI dataset for vehicle detection.
Using virtual worlds accelerates the data collection and annotation process.
Open-source code and data are provided for further research.
Abstract
Deep learning has rapidly transformed the state of the art algorithms used to address a variety of problems in computer vision and robotics. These breakthroughs have relied upon massive amounts of human annotated training data. This time consuming process has begun impeding the progress of these deep learning efforts. This paper describes a method to incorporate photo-realistic computer images from a simulation engine to rapidly generate annotated data that can be used for the training of machine learning algorithms. We demonstrate that a state of the art architecture, which is trained only using these synthetic annotations, performs better than the identical architecture trained on human annotated real-world data, when tested on the KITTI data set for vehicle detection. By training machine learning algorithms on a rich virtual world, real objects in real scenes can be learned and…
