Synthetic Data for Deep Learning
Sergey I. Nikolenko

TL;DR
This paper provides a comprehensive survey of synthetic data in deep learning, covering its development, applications across various domains, domain adaptation challenges, and privacy considerations, and it highlights directions for future research.
Contribution
It offers an extensive overview of synthetic data methods, applications, and challenges, including domain adaptation and privacy, a breadth that earlier, more narrowly focused studies lacked.
Findings
Synthetic data improves training in diverse computer vision tasks.
GANs are increasingly used for synthetic data generation and refinement.
Synthetic-to-real domain adaptation remains a key challenge.
Abstract
Synthetic data is an increasingly popular tool for training deep learning models, especially in computer vision but also in other areas. In this work, we attempt to provide a comprehensive survey of the various directions in the development and application of synthetic data. First, we discuss synthetic datasets for basic computer vision problems, both low-level (e.g., optical flow estimation) and high-level (e.g., semantic segmentation), synthetic environments and datasets for outdoor and urban scenes (autonomous driving), indoor scenes (indoor navigation), aerial navigation, simulation environments for robotics, applications of synthetic data outside computer vision (in neural programming, bioinformatics, NLP, and more); we also survey the work on improving synthetic data development and alternative ways to produce it such as GANs. Second, we discuss in detail the synthetic-to-real…
Taxonomy
Topics: Neural Networks and Applications · Time Series Analysis and Forecasting
