Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel Angel Bautista, Nathan Paczan, Russ Webb, Joshua M. Susskind

TL;DR
Hypersim is a photorealistic synthetic dataset of 77,400 images across 461 indoor scenes, with detailed per-pixel labels and ground truth geometry, intended for training and evaluating holistic indoor scene understanding models.
Contribution
The paper introduces Hypersim, a synthetic dataset built exclusively from publicly available 3D assets, with complete scene geometry, material and lighting information for every scene, and dense per-pixel semantic instance segmentations for every image.
Findings
Pre-training on Hypersim improves real-world scene understanding performance.
Pre-training on Hypersim achieves state-of-the-art performance on the most challenging Pix3D test set.
Generating the entire dataset from scratch costs roughly half as much as training a popular open-source natural language processing model.
Abstract
For many fundamental scene understanding tasks, it is difficult or impossible to obtain per-pixel ground truth labels from real images. We address this challenge by introducing Hypersim, a photorealistic synthetic dataset for holistic indoor scene understanding. To create our dataset, we leverage a large repository of synthetic scenes created by professional artists, and we generate 77,400 images of 461 indoor scenes with detailed per-pixel labels and corresponding ground truth geometry. Our dataset: (1) relies exclusively on publicly available 3D assets; (2) includes complete scene geometry, material information, and lighting information for every scene; (3) includes dense per-pixel semantic instance segmentations and complete camera information for every image; and (4) factors every image into diffuse reflectance, diffuse illumination, and a non-diffuse residual term that captures…
Taxonomy
Topics: Video Surveillance and Tracking Methods · 3D Surveying and Cultural Heritage · Advanced Image and Video Retrieval Techniques
