MVImgNet: A Large-scale Dataset of Multi-view Images
Xianggang Yu, Mutian Xu, Yidan Zhang, Haolin Liu, Chongjie Ye, Yushuang Wu, Zizheng Yan, Chenming Zhu, Zhangyang Xiong, Tianyou Liang, Guanying Chen, Shuguang Cui, Xiaoguang Han

TL;DR
MVImgNet is a large-scale multi-view image dataset designed to bridge 2D and 3D vision, enabling advancements in 3D understanding and related tasks through rich annotations and multi-view data.
Contribution
The paper introduces MVImgNet, a comprehensive multi-view image dataset with annotations, and derives MVPNet, a 3D point cloud dataset, to facilitate research in 3D vision and multi-view learning.
Findings
MVImgNet demonstrates promising results in 3D and 2D visual tasks.
MVPNet benefits 3D object classification tasks.
The datasets enable new research directions in 3D vision.
Abstract
Being data-driven is one of the most iconic properties of deep learning algorithms. The birth of ImageNet drives a remarkable trend of "learning from large-scale data" in computer vision. Pretraining on ImageNet to obtain rich universal representations has been manifested to benefit various 2D visual tasks, and becomes a standard in 2D vision. However, due to the laborious collection of real-world 3D data, there is yet no generic dataset serving as a counterpart of ImageNet in 3D vision, thus how such a dataset can impact the 3D community is unraveled. To remedy this defect, we introduce MVImgNet, a large-scale dataset of multi-view images, which is highly convenient to gain by shooting videos of real-world objects in human daily life. It contains 6.5 million frames from 219,188 videos crossing objects from 238 classes, with rich annotations of object masks, camera parameters, and point…
Taxonomy
Topics: Advanced Neural Network Applications · Remote Sensing and LiDAR Applications · 3D Surveying and Cultural Heritage
