SegVec3D: A Method for Vector Embedding of 3D Objects Oriented Towards Robot manipulation

Zhihan Kang; Boyu Wang

arXiv:2507.09459 · cs.CV · July 15, 2025

TL;DR

SegVec3D is a comprehensive framework that combines hierarchical feature extraction, contrastive clustering, and multimodal alignment to improve 3D object segmentation and zero-shot retrieval in robotics.

Contribution

It introduces a unified approach for 3D instance segmentation and multimodal understanding with minimal supervision, advancing the integration of geometric and semantic data.

Findings

1. Unifies instance segmentation and multimodal understanding with minimal supervision, in contrast to recent methods such as Mask3D and ULIP.

2. Enables zero-shot retrieval of 3D objects using natural language queries.

3. Supports unsupervised instance segmentation through contrastive clustering.
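
To make the contrastive idea behind the clustering finding concrete, here is a minimal sketch of an InfoNCE-style loss for a single anchor embedding. The choice of InfoNCE, the function names `normalize`, `dot`, and `info_nce`, and the `temperature` value are all assumptions for illustration; the summary above does not state which contrastive objective SegVec3D actually uses.

```python
import math

def normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (assumed objective, not the paper's).

    The loss is low when the anchor is closer to the positive (e.g. a point
    from the same instance) than to the negatives (points from other instances).
    """
    a = normalize(anchor)
    logits = [dot(a, normalize(positive)) / temperature]
    logits += [dot(a, normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability assigned to the positive.
    return log_sum - logits[0]

# Toy 2D embeddings: the positive is nearly parallel to the anchor.
loss_good = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# Same vectors, but an orthogonal point is wrongly treated as the positive.
loss_bad = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(loss_good < loss_bad)
```

Minimizing a loss of this kind pulls embeddings of the same instance together and pushes others apart, after which instances could be recovered by grouping nearby embeddings without labels.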

Abstract

We propose SegVec3D, a novel framework for 3D point cloud instance segmentation that integrates attention mechanisms, embedding learning, and cross-modal alignment. The approach builds a hierarchical feature extractor to enhance geometric structure modeling and enables unsupervised instance segmentation via contrastive clustering. It further aligns 3D data with natural language queries in a shared semantic space, supporting zero-shot retrieval. Compared to recent methods like Mask3D and ULIP, our method uniquely unifies instance segmentation and multimodal understanding with minimal supervision and practical deployability.
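
Zero-shot retrieval in a shared semantic space can be sketched as a nearest-neighbor search by cosine similarity. The toy vectors, object names, and the helpers `cosine` and `retrieve` below are hypothetical; in practice the query vector would come from a text encoder and the object vectors from the 3D encoder, neither of which is specified in the abstract.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors of equal length.
    d = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return d / (na * nb)

def retrieve(query_vec, object_vecs, top_k=1):
    """Return the names of the top_k objects most similar to the query."""
    ranked = sorted(
        object_vecs,
        key=lambda name: cosine(query_vec, object_vecs[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical embeddings of segmented 3D instances in the shared space.
objects = {
    "mug": [0.9, 0.1, 0.0],
    "drill": [0.1, 0.9, 0.2],
    "box": [0.0, 0.2, 0.9],
}
# Hypothetical embedding of a language query such as "something to drink from".
query = [0.8, 0.2, 0.1]

print(retrieve(query, objects, top_k=2))  # ['mug', 'drill']
```

Because ranking depends only on similarity in the shared space, object categories never seen with labels during training can still be retrieved, which is what makes the setting zero-shot.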

Taxonomy

Topics: Manufacturing Process and Optimization