MVImgNet2.0: A Larger-scale Dataset of Multi-view Images
Xiaoguang Han, Yushuang Wu, Luyue Shi, Haolin Liu, Hongjie Liao, Lingteng Qiu, Weihao Yuan, Xiaodong Gu, Zilong Dong, Shuguang Cui

TL;DR
MVImgNet2.0 is a significantly expanded and improved multi-view image dataset with ~520k objects in 515 categories, supporting advanced 3D reconstruction and vision tasks through high-quality multi-view captures, segmentation, and pose estimation.
Contribution
The paper introduces MVImgNet2.0, a larger, higher-quality multi-view dataset with 360-degree views, improved segmentation, and better pose estimation, bridging 2D and 3D vision research.
Findings
Using MVImgNet2.0 improves 3D reconstruction performance.
Provides high-quality multi-view images and point clouds for research.
Supports diverse downstream 3D vision applications.
Abstract
MVImgNet is a large-scale dataset that contains multi-view images of ~220k real-world objects in 238 classes. As a counterpart of ImageNet, it introduces 3D visual signals via multi-view shooting, forming a soft bridge between 2D and 3D vision. This paper constructs the MVImgNet2.0 dataset, which expands MVImgNet to a total of ~520k objects and 515 categories, yielding a 3D dataset whose scale is more comparable to that of datasets in the 2D domain. In addition to the expanded dataset scale and category range, MVImgNet2.0 is of higher quality than MVImgNet owing to four new features: (i) most shoots capture 360-degree views of the objects, which supports learning complete object reconstruction; (ii) the segmentation method is improved to produce foreground object masks of higher accuracy; (iii) a more powerful structure-from-motion method is adopted to derive…
