Find Any Part in 3D

Ziqi Ma; Yisong Yue; Georgia Gkioxari

arXiv:2411.13550 · cs.CV · March 31, 2025

TL;DR

This paper introduces a data engine powered by 2D foundation models to automatically annotate extensive 3D object part data, enabling a generalizable open-world 3D part segmentation model that outperforms existing methods.

Contribution

It presents a novel data engine that significantly expands 3D part datasets and trains a model capable of zero-shot generalization to any part in any object.

Findings

01. 260% improvement in mIoU over existing methods
02. 6x to 300x speed boost
03. Annotated 1755x more unique part types than previous datasets
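
To make the headline metric concrete, here is a minimal sketch of how mean intersection-over-union (mIoU) over part labels and a relative improvement are commonly computed. It is not the paper's evaluation code: the function names, the per-point integer label format, and the example numbers are assumptions made for illustration, and the paper's exact protocol may differ.

```python
from typing import Optional, Sequence


def part_iou(pred: Sequence[int], truth: Sequence[int], part_id: int) -> Optional[float]:
    """IoU for one part label over per-point labels.

    Returns None when the part appears in neither the prediction nor the ground truth.
    """
    inter = sum(1 for p, t in zip(pred, truth) if p == part_id and t == part_id)
    union = sum(1 for p, t in zip(pred, truth) if p == part_id or t == part_id)
    return inter / union if union else None


def mean_iou(pred: Sequence[int], truth: Sequence[int], part_ids: Sequence[int]) -> float:
    """Average the per-part IoU over all parts that occur at least once."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must label the same points")
    ious = [part_iou(pred, truth, k) for k in part_ids]
    valid = [v for v in ious if v is not None]
    if not valid:
        raise ValueError("none of the requested parts occur in pred or truth")
    return sum(valid) / len(valid)


def relative_improvement_pct(baseline: float, new: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical labels for six points and three parts (0, 1, 2).
    truth = [0, 0, 1, 1, 2, 2]
    pred = [0, 0, 1, 2, 2, 2]
    print(mean_iou(pred, truth, part_ids=[0, 1, 2]))  # approximately 0.722
    # Hypothetical scores: a 260% relative improvement means 3.6x the baseline.
    print(relative_improvement_pct(0.10, 0.36))  # approximately 260.0
```

Under this reading, a 260% improvement means the new score is 3.6 times the baseline, not 2.6 times.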

Abstract

Why don't we have foundation models in 3D yet? A key limitation is data scarcity. For 3D object part segmentation, existing datasets are small in size and lack diversity. We show that it is possible to break this data barrier by building a data engine powered by 2D foundation models. Our data engine automatically annotates any number of object parts: 1755x more unique part types than existing datasets combined. By training on our annotated data with a simple contrastive objective, we obtain an open-world model that generalizes to any part in any object based on any text query. Even when evaluated zero-shot, we outperform existing methods on the datasets they train on. We achieve 260% improvement in mIoU and boost speed by 6x to 300x. Our scaling analysis confirms that this generalization stems from the data scale, which underscores the impact of our data engine. Finally, to advance…
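
The abstract mentions a contrastive objective and text-based part queries without giving details here. The sketch below shows one generic way such a setup can work: a softmax cross-entropy over cosine similarities between a per-point feature and candidate text embeddings, plus a nearest-text assignment at query time. It is an illustration under stated assumptions, not the authors' loss or model; the function names, the temperature value, and the toy 2-D embeddings are all hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(
    point_feat: Sequence[float],
    text_embs: Sequence[Sequence[float]],
    positive_idx: int,
    temperature: float = 0.1,  # assumed value, chosen only for illustration
) -> float:
    """Cross-entropy over temperature-scaled similarities (generic InfoNCE-style form).

    The loss is small when the point feature is closest to its matching text embedding.
    """
    logits = [cosine(point_feat, t) / temperature for t in text_embs]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_z = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_z - logits[positive_idx]


def assign_parts(
    point_feats: Sequence[Sequence[float]],
    text_embs: Sequence[Sequence[float]],
) -> List[int]:
    """Label each point with the index of the most similar text query."""
    return [
        max(range(len(text_embs)), key=lambda k: cosine(f, text_embs[k]))
        for f in point_feats
    ]


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for two text queries, e.g. "handle" and "lid".
    text_embs = [[1.0, 0.0], [0.0, 1.0]]
    point_feats = [[0.9, 0.1], [0.2, 0.8]]
    print(assign_parts(point_feats, text_embs))  # [0, 1]
    print(contrastive_loss(point_feats[0], text_embs, positive_idx=0))
```

Because the queries are plain text embeddings, the set of part names does not have to be fixed in advance, which is the property an open-world model relies on.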

Code & Models

Repositories

ziqi-ma/find3d
pytorch

Models

ziqima/find3d-checkpt0
model · 407 dl · ♡ 1

Datasets

ziqima/Objaverse-General-Find3D
dataset · 34 dl

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Manufacturing Process and Optimization · Image Processing and 3D Reconstruction

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings