SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data

Yuan-Ting Hu; Jiahong Wang; Raymond A. Yeh; Alexander G. Schwing

arXiv:2105.08612 · cs.CV · May 19, 2021 · 1 cites

TL;DR

This paper introduces SAIL-VOS 3D, a synthetic video dataset with mesh annotations and baseline models for 3D object reconstruction from video, demonstrating that temporal data enhances reconstruction accuracy.

Contribution

The paper presents SAIL-VOS 3D, a synthetic video dataset with frame-by-frame mesh annotations that extends SAIL-VOS, and develops first baseline temporal models for reconstructing 3D meshes from video.

Findings

01. Temporal models outperform single-image methods.
02. Using video data improves 3D mesh reconstruction quality.
03. The SAIL-VOS 3D dataset enables studying temporal effects in 3D reconstruction.

Abstract

Extracting detailed 3D information of objects from video data is an important goal for holistic scene understanding. While recent methods have shown impressive results when reconstructing meshes of objects from a single image, results often remain ambiguous because part of the object is unobserved. Moreover, existing image-based datasets for mesh reconstruction do not permit the study of models that integrate temporal information. To alleviate both concerns, we present SAIL-VOS 3D: a synthetic video dataset with frame-by-frame mesh annotations which extends SAIL-VOS. We also develop first baselines for reconstruction of 3D meshes from video data via temporal models. We demonstrate the efficacy of the proposed baseline on SAIL-VOS 3D and Pix3D, showing that temporal information improves reconstruction quality. Resources and additional information are available at http://sailvos.web.illinois.edu.

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Advanced Vision and Imaging