PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model
Amrin Kareem, Jean Lahoud, and Hisham Cholakkal

TL;DR
This paper introduces reasoning-based 3D part segmentation, a task in which a model interprets implicit textual queries to segment parts of 3D objects and explain its answer. It also presents PARIS3D, a model for this task, and a curated dataset of over 60k instructions.
Contribution
The paper contributes a new reasoning-based 3D part segmentation task, a dataset of over 60k instructions paired with ground-truth part segmentation annotations, and a model that segments parts of 3D objects from implicit textual queries.
Findings
Achieves competitive segmentation performance with implicit queries.
Can generate natural language explanations for segmentation.
Demonstrates reasoning and world knowledge integration.
Abstract
Recent advancements in 3D perception systems have significantly improved their ability to perform visual recognition tasks such as segmentation. However, these systems still heavily rely on explicit human instruction to identify target objects or categories, lacking the capability to actively reason and comprehend implicit user intentions. We introduce a novel segmentation task known as reasoning part segmentation for 3D objects, aiming to output a segmentation mask based on complex and implicit textual queries about specific parts of a 3D object. To facilitate evaluation and benchmarking, we present a large 3D dataset comprising over 60k instructions paired with corresponding ground-truth part segmentation annotations specifically curated for reasoning-based 3D part segmentation. We propose a model that is capable of segmenting parts of 3D objects based on implicit textual queries and…
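As an illustration of the task setup described in the abstract, the sketch below pairs an implicit query with a per-point ground-truth part mask and scores a predicted mask with intersection-over-union (IoU). The record layout, the field names (`points`, `query`, `gt_mask`), the example query, and the choice of IoU as the metric are all assumptions made for illustration; the abstract does not specify the dataset format or the evaluation metric.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ReasoningSample:
    """Hypothetical record: one 3D object, one implicit query, one part mask."""
    points: List[Tuple[float, float, float]]  # (x, y, z) per point
    query: str                                # implicit textual query
    gt_mask: List[bool]                       # True where the point belongs to the target part


def mask_iou(pred: List[bool], gt: List[bool]) -> float:
    """Intersection-over-union between two per-point binary masks."""
    if len(pred) != len(gt):
        raise ValueError("masks must cover the same points")
    intersection = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy example with six points; the first three form the target part.
sample = ReasoningSample(
    points=[(0.0, 0.0, float(i)) for i in range(6)],
    query="Which part would you hold to pour from this kettle?",
    gt_mask=[True, True, True, False, False, False],
)

# A made-up prediction that gets two of the three part points and one false positive.
pred_mask = [True, True, False, True, False, False]
iou = mask_iou(pred_mask, sample.gt_mask)
print(f"IoU = {iou:.2f}")
```

The query never names the part ("handle"), so the model has to infer the target from the described function; this is what separates the task from segmentation driven by explicit category labels.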
Taxonomy
Topics: Image Processing and 3D Reconstruction · Advanced Neural Network Applications · Handwritten Text Recognition Techniques
