3D-AVS: LiDAR-based 3D Auto-Vocabulary Segmentation

Weijie Wei; Osman Ülger; Fatemeh Karimi Nejadasl; Theo Gevers; Martin R. Oswald

arXiv:2406.09126 · cs.CV · April 1, 2025

TL;DR

3D-AVS introduces an auto-vocabulary segmentation method for 3D point clouds that generates semantic categories at runtime, eliminating the need for human-provided labels and enabling richer, scalable annotations.

Contribution

The paper proposes an auto-vocabulary segmentation approach for 3D point clouds that combines image and LiDAR data, and introduces a new metric for evaluating unknown vocabularies.

Findings

1. Effective segmentation on the nuScenes and ScanNet200 datasets.
2. Generates accurate semantic classes without human labels.
3. Combining image-based and point-based recognition improves robustness under challenging lighting conditions.

Abstract

Open-Vocabulary Segmentation (OVS) methods offer promising capabilities in detecting unseen object categories, but the category must be known and needs to be provided by a human, either via a text prompt or pre-labeled datasets, thus limiting their scalability. We propose 3D-AVS, a method for Auto-Vocabulary Segmentation of 3D point clouds for which the vocabulary is unknown and auto-generated for each input at runtime, thus eliminating the human in the loop and typically providing a substantially larger vocabulary for richer annotations. 3D-AVS first recognizes semantic entities from image or point cloud data and then segments all points with the automatically generated vocabulary. Our method incorporates both image-based and point-based recognition, enhancing robustness under challenging lighting conditions where geometric information from LiDAR is especially valuable. Our point-based…
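The two-stage flow described in the abstract (recognize entities, then segment every point with the generated vocabulary) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that tags from the image-based and point-based recognizers are simply merged into one vocabulary, and that per-point features and text embeddings share a feature space so that each point can take the label with the highest cosine similarity, as is common in open-vocabulary segmentation. The function names `merge_vocabularies` and `assign_labels`, and the toy features, are hypothetical.

```python
import numpy as np


def merge_vocabularies(image_tags, point_tags):
    """Union of tags from both recognizers (assumed merge rule):
    case-insensitive, deduplicated, first-seen order preserved."""
    seen = {}
    for tag in list(image_tags) + list(point_tags):
        key = tag.strip().lower()
        if key:
            seen.setdefault(key, None)
    return list(seen)


def assign_labels(point_feats, text_feats):
    """Give each point the index of its most similar vocabulary entry.

    point_feats: (N, D) per-point features (assumed to live in the text space).
    text_feats:  (K, D) one embedding per vocabulary entry.
    """
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    return np.argmax(p @ t.T, axis=1)  # cosine similarity, best match per point


# Toy data: the vocabulary is generated per input, not supplied by a human.
vocab = merge_vocabularies(["Car", "road"], ["car", "pedestrian"])
text_feats = np.eye(len(vocab))  # stand-in for real text embeddings
point_feats = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.8],
    [0.1, 0.7, 0.2],
    [2.0, 0.5, 0.1],
])
labels = assign_labels(point_feats, text_feats)
names = [vocab[i] for i in labels]
print(vocab)
print(names)
```

In a real system the identity matrix would be replaced by embeddings from a text encoder, and the point features by the output of a 3D backbone aligned with that encoder; both are outside the scope of this sketch.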

Code & Models

Repositories

ozzyou/3d-avs (Official)

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Advanced Image and Video Retrieval Techniques · Image Processing and 3D Reconstruction

Methods: Attention Pooling