SegVec3D: A Method for Vector Embedding of 3D Objects Oriented Towards Robot Manipulation
Zhihan Kang, Boyu Wang

TL;DR
SegVec3D is a framework that combines hierarchical feature extraction, contrastive clustering, and multimodal alignment to support 3D instance segmentation and zero-shot retrieval of objects for robot manipulation.
Contribution
It introduces a unified approach for 3D instance segmentation and multimodal understanding with minimal supervision, advancing the integration of geometric and semantic data.
Findings
Unlike recent methods such as Mask3D and ULIP, unifies instance segmentation and multimodal understanding in a single framework with minimal supervision.
Enables zero-shot retrieval of 3D objects using natural language queries.
Supports unsupervised instance segmentation through contrastive clustering.
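The zero-shot retrieval finding can be illustrated with a minimal sketch. It assumes that each segmented object and each text query has already been mapped into the same embedding space, and that ranking is done by cosine similarity. The summary does not state the similarity measure, so cosine similarity is an assumption here, and the function names (`cosine_similarity`, `retrieve`) and the toy vectors are hypothetical stand-ins, not the authors' code or data.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, object_embeddings, top_k=1):
    """Rank objects by similarity to the query and return the top_k (name, score) pairs."""
    scored = [
        (name, cosine_similarity(query_embedding, emb))
        for name, emb in object_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical 3-dimensional embeddings of segmented object instances.
objects = {
    "mug": [0.9, 0.1, 0.0],
    "wrench": [0.1, 0.9, 0.2],
    "box": [0.0, 0.2, 0.9],
}

# Hypothetical stand-in for a text encoder's output for a query such as "a cup".
query = [0.8, 0.2, 0.1]

best = retrieve(query, objects, top_k=2)
print(best[0][0])
```

Because no object label is used at query time, any object whose embedding lies close to the query embedding can be returned, which is what makes the retrieval zero-shot under these assumptions.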
Abstract
We propose SegVec3D, a novel framework for 3D point cloud instance segmentation that integrates attention mechanisms, embedding learning, and cross-modal alignment. The approach builds a hierarchical feature extractor to enhance geometric structure modeling and enables unsupervised instance segmentation via contrastive clustering. It further aligns 3D data with natural language queries in a shared semantic space, supporting zero-shot retrieval. Compared to recent methods like Mask3D and ULIP, our method uniquely unifies instance segmentation and multimodal understanding with minimal supervision and practical deployability.
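As a rough illustration of the contrastive idea behind the clustering step, the sketch below computes an InfoNCE-style loss for one anchor embedding. The abstract does not specify the loss, so InfoNCE is swapped in as a common choice and is an assumption, as are the function name `info_nce`, the `temperature` value, and the toy vectors. The loss is low when the anchor is closer to its positive (a point assumed to belong to the same instance) than to the negatives.

```python
import math

def _normalize(v):
    """Scale a vector to unit length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0.0 else [x / norm for x in v]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax score of the positive among all candidates."""
    a = _normalize(anchor)
    # Index 0 holds the positive pair; the rest are negative pairs.
    logits = [_dot(a, _normalize(positive)) / temperature]
    logits += [_dot(a, _normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]

# Hypothetical 2-D point embeddings.
anchor = [1.0, 0.0]
same_instance = [0.9, 0.1]
other_instances = [[0.0, 1.0], [-1.0, 0.0]]

loss_good = info_nce(anchor, same_instance, other_instances)
# Swap roles: treat a distant point as the positive to see the loss rise.
loss_bad = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(loss_good < loss_bad)
```

Minimizing a loss of this kind pulls embeddings of the same instance together and pushes different instances apart, after which instances could be recovered by clustering the embeddings without instance labels.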
Taxonomy
Topics: Manufacturing Process and Optimization
