AffordMatcher: Affordance Learning in 3D Scenes from Visual Signifiers
Nghia Vu, Tuong Do, Khang Nguyen, Baoru Huang, Nhat Le, Binh Xuan Nguyen, Erman Tjiputra, Quang D. Tran, Ravi Prakash, Te-Chuan Chiu, Anh Nguyen

TL;DR
This paper introduces AffordMatcher, a method for affordance learning in 3D scenes that matches visual signifiers between RGB images and point clouds, built on a new large-scale dataset.
Contribution
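The work presents AffordBridge, a large-scale dataset of functional interaction annotations on indoor point-cloud scenes with linked RGB images, and AffordMatcher, an affordance learning method that establishes semantic correspondences between image-based and point cloud-based instances.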
Findings
AffordMatcher outperforms existing methods in identifying affordance regions.
AffordBridge contains 291,637 functional interaction annotations across 685 high-resolution indoor scenes.
Experimental results validate the effectiveness of the proposed approach.
Abstract
Affordance learning is a complex challenge in many applications, where existing approaches primarily focus on the geometric structures, visual knowledge, and affordance labels of objects to determine interactable regions. However, extending this learning capability to a scene is significantly more complicated, as incorporating object- and scene-level semantics is not straightforward. In this work, we introduce AffordBridge, a large-scale dataset with 291,637 functional interaction annotations across 685 high-resolution indoor scenes in the form of point clouds. Our affordance annotations are complemented by RGB images that are linked to the same instances within the scenes. Building upon our dataset, we propose AffordMatcher, an affordance learning method that establishes coherent semantic correspondences between image-based and point cloud-based instances for keypoint matching,…
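The abstract is cut off before it describes how the correspondences are computed, so the following is only a generic illustration of cross-modal keypoint matching, not the authors' method. It assumes that image keypoints and scene points have already been embedded into a shared feature space (the feature arrays, the function name `mutual_nearest_matches`, and the `min_sim` threshold are all hypothetical), and it pairs them by mutual nearest neighbour under cosine similarity.

```python
import numpy as np


def mutual_nearest_matches(img_feats, pc_feats, min_sim=0.5):
    """Pair image keypoints with scene points by mutual nearest neighbour.

    img_feats: (n_img, d) array of image keypoint features (assumed given).
    pc_feats:  (n_pc, d) array of point cloud features (assumed given).
    Returns a list of (image_index, point_index, cosine_similarity).
    """
    img_feats = np.asarray(img_feats, dtype=float)
    pc_feats = np.asarray(pc_feats, dtype=float)

    # L2-normalise rows so that a dot product equals cosine similarity.
    # The small floor avoids division by zero for all-zero feature rows.
    a = img_feats / np.maximum(np.linalg.norm(img_feats, axis=1, keepdims=True), 1e-12)
    b = pc_feats / np.maximum(np.linalg.norm(pc_feats, axis=1, keepdims=True), 1e-12)

    sim = a @ b.T                   # (n_img, n_pc) similarity matrix
    best_pc = sim.argmax(axis=1)    # best scene point for each image keypoint
    best_img = sim.argmax(axis=0)   # best image keypoint for each scene point

    matches = []
    for i, j in enumerate(best_pc):
        # Keep a pair only if each side picks the other and the
        # similarity clears the (hypothetical) threshold.
        if best_img[j] == i and sim[i, j] >= min_sim:
            matches.append((i, int(j), float(sim[i, j])))
    return matches
```

The mutual check discards one-sided matches, which is a common way to keep correspondences consistent when one modality has many more candidates than the other. A learned matcher would replace the fixed similarity and threshold with trained components.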
