3D Part Segmentation via Geometric Aggregation of 2D Visual Features

Marco Garosi; Riccardo Tedoldi; Davide Boscaini; Massimiliano Mancini; Nicu Sebe; Fabio Poiesi

arXiv:2412.04247 · cs.CV · January 9, 2025

TL;DR

COPS is a novel 3D part segmentation method that combines visual features from multiple viewpoints with 3D geometric information, enabling accurate zero-shot segmentation across diverse datasets without extensive prompt engineering.

Contribution

The paper introduces a geometric-aware feature aggregation technique that leverages multi-view visual features and 3D geometry for effective zero-shot 3D part segmentation.
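To make the idea of aggregating features with 3D geometry concrete, here is a minimal sketch that smooths per-point features over each point's nearest neighbours in 3D. It is an illustrative stand-in, not the aggregation procedure used by COPS: the function name `knn_smooth_features`, the brute-force neighbour search, and the plain mean are all assumptions made for this example.

```python
import numpy as np

def knn_smooth_features(points, feats, k=4):
    """Average each point's feature with those of its k nearest 3D neighbours.

    points: (N, 3) array of 3D coordinates.
    feats:  (N, D) array of per-point features.
    Hypothetical helper for illustration; not taken from the COPS code base.
    """
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    # Indices of the k closest points (each point is its own nearest neighbour).
    nn_idx = np.argsort(dist2, axis=1)[:, :k]
    # Mean over each neighbourhood, shape (N, D).
    return feats[nn_idx].mean(axis=1)
```

The brute-force distance matrix costs O(N²) memory, so a real pipeline would likely swap in a spatial index; the point here is only that nearby points end up with similar features.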

Findings

1. Achieves state-of-the-art zero-shot performance on five datasets.
2. Effective across synthetic, real-world, textured, and non-textured objects.
3. Scalable and efficient for diverse 3D shapes.

Abstract

Supervised 3D part segmentation models are tailored for a fixed set of objects and parts, limiting their transferability to open-set, real-world scenarios. Recent works have explored vision-language models (VLMs) as a promising alternative, using multi-view rendering and textual prompting to identify object parts. However, naively applying VLMs in this context introduces several drawbacks, such as the need for meticulous prompt engineering, and fails to leverage the 3D geometric structure of objects. To address these limitations, we propose COPS, a COmprehensive model for Parts Segmentation that blends the semantics extracted from visual concepts and 3D geometry to effectively identify object parts. COPS renders a point cloud from multiple viewpoints, extracts 2D features, projects them back to 3D, and uses a novel geometric-aware feature aggregation procedure to ensure spatial and…
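The render, extract, and back-project steps described in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses an orthographic projection, ignores occlusion, assumes the point cloud is normalized to the unit ball, and takes precomputed per-view feature maps as input. The function name `lift_features_to_points` and its arguments are hypothetical.

```python
import numpy as np

def lift_features_to_points(points, view_rotations, feature_maps):
    """Average, per 3D point, the 2D features it projects onto in each view.

    points:         (N, 3) array, assumed normalized to the unit ball.
    view_rotations: list of (3, 3) rotation matrices, one per viewpoint.
    feature_maps:   list of (H, W, D) arrays, one per viewpoint.
    Hypothetical helper: orthographic camera, no occlusion handling.
    """
    n_points = points.shape[0]
    feat_dim = feature_maps[0].shape[-1]
    accum = np.zeros((n_points, feat_dim))
    for rot, fmap in zip(view_rotations, feature_maps):
        h, w, _ = fmap.shape
        # Rotate points into the camera frame of this view.
        cam = points @ rot.T
        # Orthographic projection: drop depth, map x and y from [-1, 1] to pixel indices.
        cols = np.clip(((cam[:, 0] + 1) / 2 * (w - 1)).round().astype(int), 0, w - 1)
        rows = np.clip(((cam[:, 1] + 1) / 2 * (h - 1)).round().astype(int), 0, h - 1)
        # Gather the feature vector under each projected point.
        accum += fmap[rows, cols]
    # Uniform average over views.
    return accum / len(feature_maps)
```

In practice the feature maps would come from a 2D vision backbone applied to each rendered view, and a visibility test would exclude points hidden from a given camera; both are left out here to keep the sketch short.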

Code & Models

Repositories

marco-garosi/COPS (PyTorch, official)

Taxonomy

Topics: Image Processing and 3D Reconstruction · Industrial Vision Systems and Defect Detection · Image and Object Detection Techniques

Methods: Sparse Evolutionary Training