SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth
John McCormac, Ankur Handa, Stefan Leutenegger, Andrew J. Davison

TL;DR
SceneNet RGB-D is a large-scale synthetic dataset of 5 million photorealistic indoor RGB-D images with ground truth, designed to advance scene understanding and geometric vision research.
Contribution
The paper introduces a large, diverse synthetic dataset with accurate ground truth for training and evaluating indoor scene understanding and geometric computer vision algorithms.
Findings
Provides 5M rendered RGB-D images from over 15K trajectories, a scale suited to pre-training deep models from scratch.
Enables research on 3D scene labeling with perfect camera pose and depth ground truth.
Facilitates development of data-driven methods for indoor scene understanding.
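To make the value of perfect camera pose and depth ground truth concrete, the sketch below back-projects a depth map into 3D points and moves them into a world frame, which is the basic step behind fusing per-frame labels into a 3D scene. It is a minimal illustration, not the dataset's own tooling: the function names, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the 4x4 camera-to-world pose, and the assumption that depth is measured along the optical axis are all hypothetical choices made here; the dataset's actual file formats and depth convention should be checked against its documentation.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Turn an (H, W) depth map into (H*W, 3) points in the camera frame.

    Assumes a pinhole camera and depth measured along the optical axis (z).
    """
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def to_world(points_cam, T_world_cam):
    """Apply a 4x4 camera-to-world pose to (N, 3) camera-frame points."""
    R = T_world_cam[:3, :3]
    t = T_world_cam[:3, 3]
    return points_cam @ R.T + t

# Toy example: a 2x2 depth map at constant depth 2, camera shifted 1 unit in x.
depth = np.full((2, 2), 2.0)
pts = backproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
T = np.eye(4)
T[:3, 3] = [1.0, 0.0, 0.0]
world = to_world(pts, T)
```

With exact poses and depth, points from different frames of the same trajectory land in one consistent world frame, so per-pixel labels can be accumulated in 3D without any pose-estimation error.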
Abstract
We introduce SceneNet RGB-D, expanding the previous work of SceneNet to enable large scale photorealistic rendering of indoor scene trajectories. It provides pixel-perfect ground truth for scene understanding problems such as semantic segmentation, instance segmentation, and object detection, and also for geometric computer vision problems such as optical flow, depth estimation, camera pose estimation, and 3D reconstruction. Random sampling permits virtually unlimited scene configurations, and here we provide a set of 5M rendered RGB-D images from over 15K trajectories in synthetic layouts with random but physically simulated object poses. Each layout also has random lighting, camera trajectories, and textures. The scale of this dataset is well suited for pre-training data-driven computer vision techniques from scratch with RGB-D inputs, which previously has been limited by relatively…
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage
