Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

Mike Roberts; Jason Ramapuram; Anurag Ranjan; Atulit Kumar; Miguel Angel Bautista; Nathan Paczan; Russ Webb; Joshua M. Susskind

arXiv:2011.02523·cs.CV·August 19, 2021

TL;DR

Hypersim is a large, photorealistic synthetic dataset of indoor scenes with detailed per-pixel annotations, intended for training and evaluating scene understanding models when ground truth labels are difficult to obtain from real images.

Contribution

The paper introduces Hypersim, a comprehensive synthetic dataset with detailed scene annotations created from publicly available assets, enabling cost-effective training and analysis.

Findings

01

Pre-training on Hypersim improves performance on real-world scene understanding tasks.

02

Models pre-trained on Hypersim achieve state-of-the-art results on Pix3D.

03

Generating the dataset costs roughly half as much as training a large NLP model.

Abstract

For many fundamental scene understanding tasks, it is difficult or impossible to obtain per-pixel ground truth labels from real images. We address this challenge by introducing Hypersim, a photorealistic synthetic dataset for holistic indoor scene understanding. To create our dataset, we leverage a large repository of synthetic scenes created by professional artists, and we generate 77,400 images of 461 indoor scenes with detailed per-pixel labels and corresponding ground truth geometry. Our dataset: (1) relies exclusively on publicly available 3D assets; (2) includes complete scene geometry, material information, and lighting information for every scene; (3) includes dense per-pixel semantic instance segmentations and complete camera information for every image; and (4) factors every image into diffuse reflectance, diffuse illumination, and a non-diffuse residual term that captures…
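As a rough illustration of point (4), the sketch below recombines the three per-pixel factors into a final color. It assumes the factors combine as color = diffuse reflectance × diffuse illumination + non-diffuse residual, applied per channel. The abstract as quoted here is truncated and does not state this formula, so treat it as an assumption. The names `recompose_pixel` and `recompose_image` are hypothetical and are not part of any Hypersim tooling.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # linear RGB values for one pixel
Image = List[List[Pixel]]           # rows of pixels


def recompose_pixel(reflectance: Pixel, illumination: Pixel, residual: Pixel) -> Pixel:
    """Combine the three factors for one pixel, channel by channel.

    Assumed model (not confirmed by the truncated abstract):
        color = reflectance * illumination + residual
    """
    r, g, b = (refl * illum + res
               for refl, illum, res in zip(reflectance, illumination, residual))
    return (r, g, b)


def recompose_image(reflectance: Image, illumination: Image, residual: Image) -> Image:
    """Apply recompose_pixel to every pixel of three same-sized images."""
    return [
        [recompose_pixel(p_refl, p_illum, p_res)
         for p_refl, p_illum, p_res in zip(row_refl, row_illum, row_res)]
        for row_refl, row_illum, row_res in zip(reflectance, illumination, residual)
    ]


# A 1x1 "image": a mid-grey surface, lit unevenly across channels,
# with a small non-diffuse contribution in two channels.
example = recompose_image(
    [[(0.5, 0.5, 0.5)]],
    [[(2.0, 1.0, 0.0)]],
    [[(0.25, 0.0, 0.1)]],
)
```

Under this assumed model, setting the residual to zero leaves only the diffuse part of the image, which is one reason such a factorization is convenient for studying materials and lighting separately.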

Taxonomy

Topics: Video Surveillance and Tracking Methods · 3D Surveying and Cultural Heritage · Advanced Image and Video Retrieval Techniques