See More and Know More: Zero-shot Point Cloud Segmentation via Multi-modal Visual Data

Yuhang Lu; Qi Jiang; Runnan Chen; Yuenan Hou; Xinge Zhu; Yuexin Ma

arXiv:2307.10782 · cs.CV · July 21, 2023 · 2 cites

Open Access

TL;DR

This paper introduces a multi-modal zero-shot point cloud segmentation approach that uses both point clouds and images to improve recognition of unseen object classes, outperforming existing methods on SemanticKITTI and nuScenes.

Contribution

It proposes a novel multi-modal learning framework combining point cloud and image data for zero-shot segmentation, addressing the limited information in point clouds alone.

Findings

01

Achieved 52% and 49% improvements in unseen-class mIoU on SemanticKITTI and nuScenes, respectively.

02

Demonstrated superior performance over state-of-the-art zero-shot segmentation methods.

03

Validated effectiveness through extensive experiments on two popular benchmarks.
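
The findings above are reported as mIoU over unseen classes. As a rough illustration of that metric (not the paper's evaluation code), the sketch below computes per-class intersection-over-union from per-point labels and averages it over a chosen set of classes. The function names, the toy labels, and the choice of which classes count as unseen are all assumptions made for this example.

```python
def class_iou(pred, gt, cls):
    """IoU for one class: |pred == cls AND gt == cls| / |pred == cls OR gt == cls|."""
    inter = sum(1 for p, g in zip(pred, gt) if p == cls and g == cls)
    union = sum(1 for p, g in zip(pred, gt) if p == cls or g == cls)
    # A class absent from both prediction and ground truth has no defined IoU.
    return inter / union if union else None


def mean_iou(pred, gt, classes):
    """Average IoU over the given classes, skipping classes with undefined IoU."""
    ious = [class_iou(pred, gt, c) for c in classes]
    ious = [x for x in ious if x is not None]
    return sum(ious) / len(ious) if ious else 0.0


# Toy per-point labels (hypothetical data, one label per point).
gt = ["car", "car", "truck", "truck", "bike", "bike"]
pred = ["car", "truck", "truck", "truck", "bike", "car"]

# Assumption for this example: "truck" and "bike" are the unseen classes.
unseen_classes = ["truck", "bike"]
unseen_miou = mean_iou(pred, gt, unseen_classes)
print(f"unseen-class mIoU: {unseen_miou:.4f}")
```

Restricting the average to the unseen classes is what separates this number from the overall mIoU, which would also include seen classes such as "car" here.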

Abstract

Zero-shot point cloud segmentation aims to make deep models capable of recognizing novel objects in point clouds that are unseen in the training phase. Recent work favors a pipeline that transfers knowledge from seen classes with labels to unseen classes without labels. Such methods typically align visual features with semantic features obtained from word embeddings, supervised by the annotations of seen classes. However, point clouds contain too little information to fully match the semantic features. The rich appearance information of images is a natural complement to textureless point clouds, yet it is not well explored in previous literature. Motivated by this, we propose a novel multi-modal zero-shot learning method to better utilize the complementary information of point clouds and images for more accurate visual-semantic alignment. Extensive experiments are performed in two…
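
To make the idea of visual-semantic alignment concrete, here is a minimal sketch in which each class is represented by a word-embedding vector and a point is assigned to the class whose embedding is most similar to its visual feature. It is a toy illustration under stated assumptions, not the paper's method: the weighted-average `fuse` function is a simplistic stand-in for the learned fusion of point and image features, and the three-dimensional vectors and class names are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse(point_feat, image_feat, weight=0.5):
    """Hypothetical fusion: weighted average of point and image features."""
    return [weight * p + (1.0 - weight) * i for p, i in zip(point_feat, image_feat)]


def classify(visual_feat, class_embeddings):
    """Return the class whose semantic embedding best matches the visual feature."""
    return max(class_embeddings, key=lambda name: cosine(visual_feat, class_embeddings[name]))


# Invented semantic embeddings; in practice these come from a word-embedding model.
class_embeddings = {
    "car": [1.0, 0.0, 0.0],
    "truck": [0.8, 0.6, 0.0],  # treated as an unseen class in this toy setup
    "bicycle": [0.0, 0.0, 1.0],
}

point_feat = [0.9, 0.1, 0.0]  # geometry alone resembles "car"
image_feat = [0.7, 0.9, 0.0]  # appearance adds evidence for "truck"

point_only = classify(point_feat, class_embeddings)
fused = classify(fuse(point_feat, image_feat), class_embeddings)
print(point_only, fused)
```

In this toy setup the point feature alone is closest to "car", while the fused feature is closest to "truck", which mirrors the abstract's argument that image appearance can supply information the point cloud lacks.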

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: ALIGN