PartGlot: Learning Shape Part Segmentation from Language Reference Games

Juil Koo; Ian Huang; Panos Achlioptas; Leonidas Guibas; Minhyuk Sung

arXiv:2112.06390 · cs.CV · March 31, 2022 · 1 cites

TL;DR

PartGlot is a neural framework that learns semantic part segmentation of 3D shapes from the language of reference games alone, without part-level geometric annotations.

Contribution

It introduces a language-based learning approach for 3D shape segmentation that generalizes to unseen classes without direct geometric supervision.

Findings

1. The model accurately segments shape parts based on language references.
2. It generalizes to new shape classes not seen during training.
3. The approach reduces the need for large annotated datasets.

Abstract

We introduce PartGlot, a neural framework and associated architectures for learning semantic part segmentation of 3D shape geometry, based solely on part referential language. We exploit the fact that linguistic descriptions of a shape can provide priors on the shape's parts -- as natural language has evolved to reflect human perception of the compositional structure of objects, essential to their recognition and use. For training, we use the paired geometry / language data collected in the ShapeGlot work for their reference game, where a speaker creates an utterance to differentiate a target shape from two distractors and the listener has to find the target based on this utterance. Our network is designed to solve this target discrimination problem, carefully incorporating a Transformer-based attention module so that the output attention can precisely highlight the semantic part or…
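The target discrimination setup described in the abstract can be illustrated with a minimal sketch. It is a toy stand-in, not the architecture from the paper: the fixed-size segment features, the single utterance vector used as the attention query, the linear scoring vector, and all names (`attend`, `listener`, `reference_game_loss`) are assumptions made for illustration. The idea it shows is that a listener trained only to pick the target among three shapes produces, as a by-product, attention weights over each shape's regions, and those weights are what could be read out as a part highlight.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, segments):
    """Scaled dot-product attention of one utterance query over a shape's segments.

    Returns the attention weights (one per segment) and the pooled feature.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, seg) / scale for seg in segments])
    dim = len(segments[0])
    pooled = [sum(w * seg[d] for w, seg in zip(weights, segments)) for d in range(dim)]
    return weights, pooled


def listener(query, shapes, classifier):
    """Score each candidate shape; return probabilities and per-shape attention."""
    logits, attentions = [], []
    for segments in shapes:
        weights, pooled = attend(query, segments)
        attentions.append(weights)
        logits.append(dot(classifier, pooled))  # hypothetical linear scoring head
    return softmax(logits), attentions


def reference_game_loss(probs, target_index):
    """Cross-entropy for picking the target among the candidates."""
    return -math.log(probs[target_index])


# Toy data (random, purely illustrative): one target and two distractors,
# each represented by NUM_SEGMENTS region features of size DIM.
rng = random.Random(0)
DIM, NUM_SEGMENTS = 8, 5
shapes = [
    [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_SEGMENTS)]
    for _ in range(3)
]
query = [rng.gauss(0, 1) for _ in range(DIM)]       # stands in for an encoded utterance
classifier = [rng.gauss(0, 1) for _ in range(DIM)]  # stands in for learned weights

probs, attentions = listener(query, shapes, classifier)
loss = reference_game_loss(probs, target_index=0)
```

In this sketch the only supervision is `target_index`, which mirrors the abstract's point that no part labels are used; `attentions[i]` gives a distribution over the regions of shape `i` that a trained model could turn into a segmentation.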

Taxonomy

Topics: Handwritten Text Recognition Techniques · Human Motion and Animation · Human Pose and Action Recognition