Capturing Fine-Grained Alignments Improves 3D Affordance Detection
Junsei Tokumitsu, Yuiga Wada

TL;DR
This paper introduces LM-AD, a method for 3D affordance detection, and the Affordance Query Module (AQM), which uses a pretrained language model to capture fine-grained alignments between point clouds and text. The approach outperforms existing methods in accuracy and mean Intersection over Union.
Contribution
We propose LM-AD and AQM, which enhance 3D affordance detection by modeling detailed alignments using pretrained language models, addressing limitations of previous cosine similarity-based methods.
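To make the limitation concrete, the following is a minimal sketch of the kind of cosine similarity-based labeling that prior methods are described as relying on. It is not the authors' code and not LM-AD or AQM: the function names, the two-dimensional toy embeddings, and the affordance names in the comments are all hypothetical. Each point is scored against each text embedding independently, so the only interaction between the two modalities is a single normalized dot product.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def label_points_by_cosine(point_embeddings, text_embeddings):
    """Assign each point the index of the most similar text embedding.

    Assumption: one embedding per point and one per affordance label,
    all in a shared space. This is an illustrative baseline only.
    """
    labels = []
    for p in point_embeddings:
        scores = [cosine_similarity(p, t) for t in text_embeddings]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


# Hypothetical toy data: three points, two affordance labels.
point_embeddings = [[1.0, 0.1], [0.2, 0.9], [0.7, 0.6]]
text_embeddings = [[1.0, 0.0], [0.0, 1.0]]  # e.g. "grasp", "pour" (made-up names)
predicted = label_points_by_cosine(point_embeddings, text_embeddings)
```

Because the score for a point depends only on its own embedding and the pooled text embedding, this scheme has no mechanism for word-level or region-level interaction, which is the gap the paper attributes to earlier approaches.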
Findings
LM-AD outperforms existing methods in accuracy
LM-AD achieves higher mean Intersection over Union than existing methods
Effectiveness is demonstrated on the 3D AffordanceNet dataset
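The two metrics named above can be sketched as follows for per-point label predictions. This is a common formulation, not necessarily the exact evaluation protocol used in the paper: the function names are hypothetical, and skipping classes that appear in neither prediction nor ground truth is an assumption.

```python
def point_accuracy(pred, target):
    """Fraction of points whose predicted label equals the ground-truth label."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        return 0.0
    correct = sum(1 for p, t in zip(pred, target) if p == t)
    return correct / len(pred)


def mean_iou(pred, target, num_classes):
    """Average per-class Intersection over Union.

    Assumption: classes absent from both pred and target are skipped
    rather than counted as IoU 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy labels for four points and two classes.
pred = [0, 1, 0, 1]
target = [0, 1, 1, 1]
acc = point_accuracy(pred, target)
miou = mean_iou(pred, target, num_classes=2)
```

In this toy case three of four points are correct, while the single error lowers the IoU of both classes at once, which is why mean IoU is usually the stricter of the two numbers.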
Abstract
In this work, we address the challenge of affordance detection in 3D point clouds, a task that requires effectively capturing fine-grained alignments between point clouds and text. Existing methods often struggle to model such alignments, resulting in limited performance on standard benchmarks. A key limitation of these approaches is their reliance on simple cosine similarity between point cloud and text embeddings, which lacks the expressiveness needed for fine-grained reasoning. To address this limitation, we propose LM-AD, a novel method for affordance detection in 3D point clouds. Moreover, we introduce the Affordance Query Module (AQM), which efficiently captures fine-grained alignment between point clouds and text by leveraging a pretrained language model. We demonstrated that our method outperformed existing approaches in terms of accuracy and mean Intersection over Union on the…
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Manufacturing Process and Optimization
