MVImgNet2.0: A Larger-scale Dataset of Multi-view Images

Xiaoguang Han; Yushuang Wu; Luyue Shi; Haolin Liu; Hongjie Liao; Lingteng Qiu; Weihao Yuan; Xiaodong Gu; Zilong Dong; Shuguang Cui

arXiv:2412.01430·cs.CV·December 3, 2024

TL;DR

MVImgNet2.0 expands the MVImgNet multi-view image dataset to ~520k objects in 515 categories and supports 3D reconstruction and other vision tasks through higher-quality multi-view captures, segmentation masks, and pose estimation.

Contribution

The paper introduces MVImgNet2.0, a larger, higher-quality multi-view dataset in which most captures cover 360-degree views of the object, with more accurate foreground segmentation and better pose estimation, strengthening the bridge between 2D and 3D vision research.

Findings

01. Enhances 3D reconstruction performance with the new dataset.

02. Provides high-quality multi-view images and point clouds for research.

03. Supports diverse downstream 3D vision applications.

Abstract

MVImgNet is a large-scale dataset that contains multi-view images of ~220k real-world objects in 238 classes. As a counterpart of ImageNet, it introduces 3D visual signals via multi-view shooting, making a soft bridge between 2D and 3D vision. This paper constructs the MVImgNet2.0 dataset that expands MVImgNet into a total of ~520k objects and 515 categories, which derives a 3D dataset with a larger scale that is more comparable to ones in the 2D domain. In addition to the expanded dataset scale and category range, MVImgNet2.0 is of a higher quality than MVImgNet owing to four new features: (i) most shoots capture 360-degree views of the objects, which can support the learning of object reconstruction with completeness; (ii) the segmentation manner is advanced to produce foreground object masks of higher accuracy; (iii) a more powerful structure-from-motion method is adopted to derive…
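As a rough check on the scale figures quoted in the abstract, the growth from MVImgNet to MVImgNet2.0 can be worked out directly. The sketch below uses only the approximate counts given above (~220k and ~520k objects, 238 and 515 categories); the dictionary and function names are illustrative and do not come from the paper or any released tooling.

```python
# Approximate counts quoted in the abstract; object counts are rounded ("~") values.
MVIMGNET = {"objects": 220_000, "categories": 238}
MVIMGNET2 = {"objects": 520_000, "categories": 515}


def growth(old: dict, new: dict) -> dict:
    """For each key, return the absolute increase and the new/old ratio."""
    return {
        key: {"added": new[key] - old[key], "ratio": new[key] / old[key]}
        for key in old
    }


stats = growth(MVIMGNET, MVIMGNET2)
for key, s in stats.items():
    # Print the increase with thousands separators and the ratio to two decimals.
    print(f"{key}: +{s['added']:,} ({s['ratio']:.2f}x)")
```

Under these rounded figures, the expansion adds roughly 300k objects and 277 categories, about 2.4x and 2.2x the original dataset respectively.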
