Falling Things: A Synthetic Dataset for 3D Object Detection and Pose Estimation

Jonathan Tremblay; Thang To; Stan Birchfield

arXiv:1804.06534·cs.CV·July 12, 2018

TL;DR

The Falling Things (FAT) dataset offers a large collection of photorealistic synthetic images with precise 3D annotations, designed to improve object detection and pose estimation in robotics.

Contribution

This paper introduces the FAT dataset, combining high-quality synthetic images with comprehensive annotations to advance 3D object detection and pose estimation research.

Findings

01

Provides 60k annotated images of 21 household objects from the YCB dataset

02

Includes accurate 3D poses, segmentation, and bounding boxes

03

Offers mono and stereo RGB plus depth data

Abstract

We present a new dataset, called Falling Things (FAT), for advancing the state-of-the-art in object detection and 3D pose estimation in the context of robotics. By synthetically combining object models and backgrounds of complex composition and high graphical quality, we are able to generate photorealistic images with accurate 3D pose annotations for all objects in all images. Our dataset contains 60k annotated photos of 21 household objects taken from the YCB dataset. For each image, we provide the 3D poses, per-pixel class segmentation, and 2D/3D bounding box coordinates for all objects. To facilitate testing different input modalities, we provide mono and stereo RGB images, along with registered dense depth images. We describe in detail the generation process and statistical analysis of the data.
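
To illustrate how a 3D pose, a 3D bounding box, and a 2D bounding box relate to one another, here is a minimal sketch under a standard pinhole camera model. It is not the dataset's annotation format or tooling: the function names, the object dimensions, the pose, and the intrinsics below are all hypothetical values chosen for illustration, and the abstract does not say how the 2D boxes in FAT are computed.

```python
from itertools import product

def cuboid_corners(dims):
    """Eight corners of a box centered at the object origin (object frame)."""
    hx, hy, hz = (d / 2.0 for d in dims)
    return [(sx * hx, sy * hy, sz * hz)
            for sx, sy, sz in product((-1, 1), repeat=3)]

def transform(R, t, p):
    """Map an object-frame point into the camera frame: R @ p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i]
                 for i in range(3))

def project(K, p):
    """Pinhole projection of a camera-frame point (z > 0) to pixels."""
    fx, fy, cx, cy = K
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)

def bbox_2d(R, t, dims, K):
    """Axis-aligned 2D box enclosing the projected 3D cuboid corners."""
    pixels = [project(K, transform(R, t, c)) for c in cuboid_corners(dims)]
    us, vs = zip(*pixels)
    return (min(us), min(vs), max(us), max(vs))

# Hypothetical example: an unrotated 20 x 10 x 10 cm object, 1 m in front
# of a camera with assumed intrinsics (fx, fy, cx, cy).
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.0, 0.0, 1.0)
dims = (0.2, 0.1, 0.1)
K = (500.0, 500.0, 320.0, 240.0)

box = bbox_2d(R_identity, t, dims, K)
print(box)
```

A box obtained this way encloses the projected cuboid, so it is generally looser than one fitted to the object's segmentation mask; which convention a given dataset uses should be checked against its own documentation.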
