GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency

Dongyue Lu; Lingdong Kong; Tianxin Huang; Gim Hee Lee

arXiv:2412.09511 · cs.CV · December 13, 2024

TL;DR

GEAL is a framework that improves 3D affordance learning by leveraging large-scale pre-trained 2D models and cross-modal consistency, resulting in better generalization and robustness to real-world noise.

Contribution

GEAL introduces a dual-branch architecture with cross-modal alignment to make 3D affordance learning more robust, along with new corruption benchmarks for evaluating that robustness.

Findings

01. Outperforms existing methods on public datasets.

02. Remains robust on corrupted data.

03. Transfers knowledge effectively across modalities.

Abstract

Identifying affordance regions on 3D objects from semantic cues is essential for robotics and human-machine interaction. However, existing 3D affordance learning methods struggle with generalization and robustness due to limited annotated data and a reliance on 3D backbones focused on geometric encoding, which often lack resilience to real-world noise and data corruption. We propose GEAL, a novel framework designed to enhance the generalization and robustness of 3D affordance learning by leveraging large-scale pre-trained 2D models. We employ a dual-branch architecture with Gaussian splatting to establish consistent mappings between 3D point clouds and 2D representations, enabling realistic 2D renderings from sparse point clouds. A granularity-adaptive fusion module and a 2D-3D consistency alignment module further strengthen cross-modal alignment and knowledge transfer, allowing the 3D…
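
To make the idea of a consistent 3D-to-2D mapping concrete, here is a minimal sketch in Python with NumPy. It is not the GEAL implementation: a simple pinhole projection stands in for Gaussian splatting, a mean squared error stands in for whatever alignment objective the paper uses, and the names `project_points` and `consistency_loss`, the focal length, and the feature sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def project_points(points, focal, image_size):
    """Project camera-frame 3D points (N, 3) to integer pixel indices.

    Assumption: a plain pinhole camera, used here only as a stand-in
    for the Gaussian-splatting renderer described in the abstract.
    """
    h, w = image_size
    z = points[:, 2]
    u = focal * points[:, 0] / z + w / 2.0
    v = focal * points[:, 1] / z + h / 2.0
    cols = np.clip(np.round(u).astype(int), 0, w - 1)
    rows = np.clip(np.round(v).astype(int), 0, h - 1)
    return rows, cols


def consistency_loss(feat_3d, feat_2d_map, rows, cols):
    """Mean squared error between per-point 3D features (N, C) and the
    2D features (H, W, C) sampled at each point's projected pixel.

    Assumption: MSE is an illustrative choice, not the paper's loss.
    """
    sampled = feat_2d_map[rows, cols]  # shape (N, C)
    return float(np.mean((feat_3d - sampled) ** 2))


rng = np.random.default_rng(0)

# Toy point cloud placed in front of the camera (z between 1.5 and 2.5).
points = rng.uniform(-0.5, 0.5, size=(128, 3)) + np.array([0.0, 0.0, 2.0])

# Toy 2D feature map, standing in for features from a pre-trained 2D model.
feat_2d_map = rng.normal(size=(32, 32, 8))

rows, cols = project_points(points, focal=30.0, image_size=(32, 32))

# Point features that agree exactly with the 2D branch give zero loss;
# unrelated random features give a positive loss.
aligned_feats = feat_2d_map[rows, cols]
random_feats = rng.normal(size=(128, 8))

loss_aligned = consistency_loss(aligned_feats, feat_2d_map, rows, cols)
loss_random = consistency_loss(random_feats, feat_2d_map, rows, cols)
print(loss_aligned, loss_random)
```

Because every 3D point has a fixed pixel it maps to, a loss of this kind can pull the 3D branch's per-point features toward those of the 2D branch, which is one plausible reading of the cross-modal knowledge transfer the abstract describes.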

Code & Models

Repositories

DylanOrange/geal (Official)

Taxonomy

Topics: Robot Manipulation and Learning · Robotic Mechanisms and Dynamics · Model Reduction and Neural Networks