From Real to Synthetic and Back: Synthesizing Training Data for Multi-Person Scene Understanding
Igor Kviatkovsky, Nadav Bhonker, Gerard Medioni

TL;DR
This paper introduces a pipeline that turns synthetic segmentation maps into realistic multi-person images, narrowing the synthetic-to-real domain gap and improving training for scene understanding tasks such as UV mapping and depth estimation.
Contribution
We propose a synthetic data generation pipeline in which a conditional GAN (cGAN) translates segmentation maps into realistic images, producing fully annotated training data for multi-person scene understanding.
Findings
Generated data improves model performance on UV mapping and depth estimation.
The pipeline effectively reduces the synthetic-to-real domain gap.
Models trained on synthetic data achieve competitive results on real datasets.
Abstract
We present a method for synthesizing natural-looking images of multiple people interacting in a specific scenario. These images benefit from the advantages of synthetic data: being fully controllable and fully annotated with any type of standard or custom-defined ground truth. To reduce the synthetic-to-real domain gap, we introduce a pipeline consisting of the following steps: 1) we render scenes in a context modeled after the real world, 2) we train a human parsing model on the synthetic images, 3) we use the model to estimate segmentation maps for real images, 4) we train a conditional generative adversarial network (cGAN) to learn the inverse mapping -- from a segmentation map to a real image, and 5) given new synthetic segmentation maps, we use the cGAN to generate realistic images. An illustration of our pipeline is presented in Figure 2. We use the generated data to train a…
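The five steps in the abstract can be traced as a data flow with a toy sketch. This is not the authors' implementation: every function name is hypothetical, "images" are 8-pixel lists of intensities, the parsing model is replaced by a simple threshold, and the cGAN is replaced by a per-label mean-intensity lookup. The only point is to show which model is trained on which data, and that the generated images keep the synthetic ground-truth labels.

```python
from collections import defaultdict
from statistics import mean
import random

# All names are hypothetical; scenes are 8-pixel 1-D toys, not real images.

def render_synthetic(n, rng):
    """Step 1 stand-in: 'render' scenes with perfect labels (0 = background, 1 = person)."""
    scenes = []
    for _ in range(n):
        start = rng.randrange(0, 6)  # a 3-pixel 'person' at a random position
        labels = [1 if start <= i < start + 3 else 0 for i in range(8)]
        pixels = [0.8 if lab else 0.2 for lab in labels]  # flat, unrealistic shading
        scenes.append((pixels, labels))
    return scenes

def _mean_by_label(images, seg_maps):
    """Average pixel intensity observed under each label."""
    values = defaultdict(list)
    for pixels, labels in zip(images, seg_maps):
        for p, lab in zip(pixels, labels):
            values[lab].append(p)
    return {lab: mean(vals) for lab, vals in values.items()}

def train_parser(scenes):
    """Step 2 stand-in: threshold halfway between class means (not a parsing network)."""
    images, seg_maps = zip(*scenes)
    means = _mean_by_label(images, seg_maps)
    threshold = (means[0] + means[1]) / 2
    return lambda pixels: [int(p > threshold) for p in pixels]

def train_generator(real_images, seg_maps):
    """Step 4 stand-in: per-label intensity lookup. The paper trains a cGAN here."""
    table = _mean_by_label(real_images, seg_maps)
    return lambda labels: [table[lab] for lab in labels]

def run_pipeline(real_images, n_synth=4, seed=0):
    rng = random.Random(seed)
    parser = train_parser(render_synthetic(n_synth, rng))   # steps 1-2
    real_seg = [parser(img) for img in real_images]         # step 3
    generator = train_generator(real_images, real_seg)      # step 4
    new_scenes = render_synthetic(n_synth, rng)             # new synthetic seg maps
    # Step 5: realistic-looking pixels paired with the exact synthetic labels.
    return [(generator(labels), labels) for _, labels in new_scenes]

REAL_IMAGES = [  # made-up intensities standing in for real photographs
    [0.35, 0.30, 0.65, 0.70, 0.60, 0.40, 0.30, 0.35],
    [0.70, 0.65, 0.30, 0.40, 0.35, 0.30, 0.60, 0.70],
]
dataset = run_pipeline(REAL_IMAGES)
print(dataset[0])
```

In this sketch the real images never need manual annotation: their segmentation maps come from the parser trained on synthetic data, and the final dataset inherits labels directly from the renderer.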
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Advanced Image Processing Techniques
