Find Any Part in 3D
Ziqi Ma, Yisong Yue, Georgia Gkioxari

TL;DR
This paper introduces a data engine, powered by 2D foundation models, that automatically annotates 3D object parts at scale. Training on this data yields a generalizable open-world 3D part segmentation model that outperforms existing methods.
Contribution
The paper presents a data engine that annotates 1755x more unique part types than existing 3D part datasets combined, and uses the annotated data to train a model that generalizes zero-shot to any part in any object.
Findings
260% improvement in mIoU over existing methods
6x to 300x speed boost
1755x more unique part types annotated than in existing datasets combined
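
For readers unfamiliar with the metric behind the first finding, the sketch below shows how mean intersection-over-union (mIoU) is commonly computed for per-point part labels, and what a relative improvement of 260% means arithmetically (the new score is 3.6 times the baseline). The function names and the toy numbers are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
def part_miou(pred, gt, num_parts):
    """Mean IoU over part labels that appear in the prediction or ground truth.

    pred, gt: equal-length sequences of integer part labels, one per point.
    """
    ious = []
    for part in range(num_parts):
        inter = sum(1 for p, g in zip(pred, gt) if p == part and g == part)
        union = sum(1 for p, g in zip(pred, gt) if p == part or g == part)
        if union == 0:
            continue  # part absent from both; it does not count toward the mean
        ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def relative_improvement(new, old):
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100.0


# Toy example with hypothetical labels for six points and three parts.
gt_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 0, 1, 2, 2, 2]
print(part_miou(pred_labels, gt_labels, num_parts=3))

# Hypothetical scores: a baseline mIoU of 0.10 and a new mIoU of 0.36
# correspond to a relative improvement of about 260%.
print(relative_improvement(0.36, 0.10))
```

In the toy example the per-part IoUs are 1, 1/2, and 2/3, so the mean is about 0.72.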
Abstract
Why don't we have foundation models in 3D yet? A key limitation is data scarcity. For 3D object part segmentation, existing datasets are small in size and lack diversity. We show that it is possible to break this data barrier by building a data engine powered by 2D foundation models. Our data engine automatically annotates any number of object parts: 1755x more unique part types than existing datasets combined. By training on our annotated data with a simple contrastive objective, we obtain an open-world model that generalizes to any part in any object based on any text query. Even when evaluated zero-shot, we outperform existing methods on the datasets they train on. We achieve 260% improvement in mIoU and boost speed by 6x to 300x. Our scaling analysis confirms that this generalization stems from the data scale, which underscores the impact of our data engine. Finally, to advance…
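
The abstract does not spell out the form of the contrastive objective or how text queries are matched to points. The following is a generic InfoNCE-style sketch under stated assumptions: each point has a feature vector, each part name has a text embedding of the same dimension, and a point is pulled toward the embedding of its annotated part. The function names, the temperature value, and the toy vectors are all hypothetical and are not the authors' implementation.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarities(feat, text_feats):
    """Cosine similarity between one point feature and every text embedding."""
    f = normalize(feat)
    return [sum(a * b for a, b in zip(f, normalize(t))) for t in text_feats]


def contrastive_loss(point_feats, text_feats, labels, temperature=0.07):
    """Average cross-entropy over points; the annotated part is the positive."""
    total = 0.0
    for feat, label in zip(point_feats, labels):
        logits = [s / temperature for s in similarities(feat, text_feats)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[label]
    return total / len(labels)


def assign_parts(point_feats, text_feats):
    """Label each point with the index of the most similar text query."""
    result = []
    for feat in point_feats:
        sims = similarities(feat, text_feats)
        result.append(max(range(len(sims)), key=sims.__getitem__))
    return result


# Toy 2D features; in practice these would come from learned encoders.
points = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
queries = [[1.0, 0.0], [0.0, 1.0]]  # e.g. embeddings for two part names
print(assign_parts(points, queries))
print(contrastive_loss(points, queries, labels=[0, 1, 0]))
```

At inference time only `assign_parts` is needed, which is why a model trained this way can accept part names it never saw as fixed class labels.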
Taxonomy
Topics: 3D Surveying and Cultural Heritage · Manufacturing Process and Optimization · Image Processing and 3D Reconstruction
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
