Parts2Words: Learning Joint Embedding of Point Clouds and Texts by Bidirectional Matching between Parts and Words
Chuan Tang, Xi Yang, Bojian Wu, Zhizhong Han, Yi Chang

TL;DR
This paper introduces a novel method for shape-text matching that directly uses point clouds and bidirectional part-word matching, improving accuracy in multi-modal retrieval tasks.
Contribution
It proposes a joint embedding of point clouds and texts that segments shapes into parts and matches parts to words with optimal transport, avoiding the structural ambiguity that self-occlusion causes in view-based methods.
Findings
Reports a significant accuracy improvement over state-of-the-art methods on multi-modal retrieval tasks.
Learns an effective joint embedding of point clouds and texts.
Demonstrates robust shape-text matching on the Text2Shape dataset.
Abstract
Shape-text matching is an important task in high-level shape understanding. Current methods mainly represent a 3D shape as multiple 2D rendered views, which cannot be understood well because of the structural ambiguity caused by self-occlusion in the limited number of views. To resolve this issue, we directly represent 3D shapes as point clouds, and propose to learn a joint embedding of point clouds and texts by bidirectional matching between parts from shapes and words from texts. Specifically, we first segment the point clouds into parts, and then leverage an optimal transport method to match parts and words in an optimized feature space, where each part is represented by aggregating the features of all points within it and each word is abstracted by its contextual information. We optimize the feature space in order to enlarge the similarities between the paired training samples, while…
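The part-word matching step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It rests on several assumptions that the summary does not confirm: mean pooling as the aggregation of point features into part features, cosine distance as the transport cost, uniform mass on parts and words, and entropic Sinkhorn iterations as the optimal transport solver. All function names (`pool_parts`, `cosine_cost`, `sinkhorn`, `shape_text_similarity`) are hypothetical.

```python
import numpy as np

def pool_parts(point_feats, part_ids, num_parts):
    """Aggregate per-point features into one feature per part.
    Mean pooling is an assumption; the source only says 'aggregating'."""
    sums = np.zeros((num_parts, point_feats.shape[1]))
    np.add.at(sums, part_ids, point_feats)          # scatter-add points into their part
    counts = np.bincount(part_ids, minlength=num_parts).astype(float)
    return sums / np.maximum(counts, 1.0)[:, None]  # empty parts stay at zero

def cosine_cost(parts, words):
    """Cost matrix of shape (num_parts, num_words): 1 - cosine similarity (assumed cost)."""
    p = parts / (np.linalg.norm(parts, axis=1, keepdims=True) + 1e-8)
    w = words / (np.linalg.norm(words, axis=1, keepdims=True) + 1e-8)
    return 1.0 - p @ w.T

def sinkhorn(cost, eps=0.5, iters=200):
    """Entropic optimal transport with uniform marginals (assumed solver and marginals)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # mass on each part
    b = np.full(m, 1.0 / m)      # mass on each word
    K = np.exp(-cost / eps)      # Gibbs kernel
    u, v = np.ones(n), np.ones(m)
    for _ in range(iters):
        u = a / (K @ v)          # rescale rows toward marginal a
        v = b / (K.T @ u)        # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]

def shape_text_similarity(point_feats, part_ids, num_parts, word_feats):
    """Score one shape-text pair as the transport-weighted cosine similarity."""
    parts = pool_parts(point_feats, part_ids, num_parts)
    cost = cosine_cost(parts, word_feats)
    plan = sinkhorn(cost)
    return float(np.sum(plan * (1.0 - cost))), plan
```

In this sketch the transport plan acts as a soft alignment: entry (i, j) is the mass that part i sends to word j, so a single plan can be read in both directions, from parts to words and from words to parts. Training would then adjust the features so that paired shapes and texts score higher, which is outside the scope of this example.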
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Handwritten Text Recognition Techniques
