Configurable 3D Scene Synthesis and 2D Image Rendering with Per-Pixel Ground Truth using Stochastic Grammars

Chenfanfu Jiang; Siyuan Qi; Yixin Zhu; Siyuan Huang; Jenny Lin; Lap-Fai Yu; Demetri Terzopoulos; Song-Chun Zhu

arXiv:1704.00112·cs.CV·June 21, 2018

TL;DR

This paper introduces a learning-based system that generates diverse, photorealistic 3D indoor scenes and 2D images with detailed ground truth, aiding training and evaluation of computer vision models.

Contribution

It presents a novel pipeline using stochastic grammars and physics-based rendering to produce customizable, high-quality synthetic datasets with per-pixel ground truth for scene understanding tasks.

Findings

01. Enhanced depth and normal prediction accuracy

02. Improved semantic segmentation performance

03. Provided controllable benchmarks for model diagnostics

Abstract

We propose a systematic learning-based approach to the generation of massive quantities of synthetic 3D scenes and arbitrary numbers of photorealistic 2D images thereof, with associated ground truth information, for the purposes of training, benchmarking, and diagnosing learning-based computer vision and robotics algorithms. In particular, we devise a learning-based pipeline of algorithms capable of automatically generating and rendering a potentially infinite variety of indoor scenes by using a stochastic grammar, represented as an attributed Spatial And-Or Graph, in conjunction with state-of-the-art physics-based rendering. Our pipeline is capable of synthesizing scene layouts with high diversity, and it is configurable inasmuch as it enables the precise customization and control of important attributes of the generated scenes. It renders photorealistic RGB images of the generated…
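To make the idea of sampling scenes from a stochastic grammar concrete, here is a minimal sketch in Python. It is not the authors' implementation: the node names, branching probabilities, and the `GRAMMAR` and `sample` identifiers are all hypothetical, and the sketch leaves out the attributes and spatial relations that an attributed Spatial And-Or Graph would carry. It only illustrates the core mechanism, in which And-nodes expand into all of their children and Or-nodes choose one child at random according to its branching probability.

```python
import random

# Toy And-Or grammar. All symbols and probabilities below are illustrative
# assumptions, not values from the paper.
#   "and" node: expands into every listed child.
#   "or"  node: picks exactly one child, weighted by its probability.
# Any symbol without an entry here is treated as a terminal (an object).
GRAMMAR = {
    "scene": ("and", ["room_type", "walls"]),
    "room_type": ("or", [("bedroom", 0.6), ("office", 0.4)]),
    "bedroom": ("and", ["bed", "nightstand_choice"]),
    "office": ("and", ["desk", "chair"]),
    "nightstand_choice": ("or", [("no_nightstand", 0.3), ("nightstand", 0.7)]),
}


def sample(symbol, rng):
    """Return the terminal symbols of one randomly sampled parse of `symbol`."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: nothing left to expand
    kind, children = GRAMMAR[symbol]
    if kind == "and":
        terminals = []
        for child in children:
            terminals.extend(sample(child, rng))
        return terminals
    # "or" node: draw one alternative according to the branching probabilities
    names = [name for name, _ in children]
    weights = [weight for _, weight in children]
    chosen = rng.choices(names, weights=weights, k=1)[0]
    return sample(chosen, rng)


# Seeding the generator makes a sampled layout reproducible.
layout = sample("scene", random.Random(0))
print(layout)
```

Each call with a different seed yields a different parse, which is what gives a grammar-based generator its variety. Changing the probabilities on an Or-node is one simple way such a generator could be made configurable, although the paper's actual controls may differ.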

Full text

See pages 1-last of scenesynthesis2018ijcv.pdf