Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation
Yuanchen Ju, Kaizhe Hu, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu

TL;DR
Robo-ABC enables robots to generalize affordance understanding to novel objects by leveraging semantic correspondence and diffusion models, achieving zero-shot manipulation without manual annotations or additional training.
Contribution
The paper introduces Robo-ABC, a novel framework that uses semantic correspondence and diffusion models to enable zero-shot affordance generalization across object categories.
Findings
Improves visual affordance retrieval accuracy by 31.6% over state-of-the-art (SOTA) end-to-end affordance models.
Achieves 85.7% success rate in real-world cross-category grasping tasks.
Enables manipulation of out-of-category objects without manual annotations.
Abstract
Enabling robotic manipulation that generalizes to out-of-distribution scenes is a crucial step toward open-world embodied intelligence. For human beings, this ability is rooted in the understanding of semantic correspondence among objects, which naturally transfers the interaction experience of familiar objects to novel ones. Although robots lack such a reservoir of interaction experience, the vast availability of human videos on the Internet may serve as a valuable resource, from which we extract an affordance memory including the contact points. Inspired by the natural way humans think, we propose Robo-ABC: when confronted with unfamiliar objects that require generalization, the robot can acquire affordance by retrieving objects that share visual or semantic similarities from the affordance memory. The next step is to map the contact points of the retrieved objects to the new object.…
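The two steps the abstract describes, retrieving a similar object from the affordance memory and then mapping its contact point onto the new object, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes every object already has a global embedding vector and a dense per-pixel feature map from some pretrained encoder, and it stands in synthetic NumPy arrays for both. The function names `retrieve` and `transfer_contact_point` are hypothetical, and plain cosine-similarity nearest-neighbor matching is used as a simple stand-in for semantic correspondence.

```python
import numpy as np


def cosine_sim(a, b):
    """Cosine similarity between the rows of a (n, d) and the rows of b (m, d)."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T


def retrieve(query_emb, memory_embs):
    """Return the index of the memory entry most similar to the query embedding."""
    sims = cosine_sim(query_emb[None, :], memory_embs)[0]
    return int(np.argmax(sims))


def transfer_contact_point(src_feats, src_point, tgt_feats):
    """Map a (row, col) contact point from a source feature map to a target one.

    src_feats and tgt_feats have shape (H, W, C). The target location is the
    pixel whose feature vector is closest (by cosine similarity) to the
    feature vector at the source contact point.
    """
    r, c = src_point
    query = src_feats[r, c][None, :]                      # (1, C)
    h, w, ch = tgt_feats.shape
    sims = cosine_sim(query, tgt_feats.reshape(-1, ch))[0]
    idx = int(np.argmax(sims))
    return divmod(idx, w)                                 # (row, col)


# Synthetic stand-ins (assumptions, not real encoder outputs).
rng = np.random.default_rng(0)

# Affordance memory: three objects with orthogonal embeddings.
memory_embs = np.eye(3, 8)
# A "novel" object whose embedding is close to memory entry 1.
query_emb = memory_embs[1] + 0.01 * rng.normal(size=8)
best = retrieve(query_emb, memory_embs)

# Dense features: the target map is the source map shifted by (1, 2),
# so the true correspondence of source pixel (1, 1) is target pixel (2, 3).
src_feats = rng.normal(size=(4, 5, 8))
tgt_feats = np.roll(src_feats, shift=(1, 2), axis=(0, 1))
tgt_point = transfer_contact_point(src_feats, (1, 1), tgt_feats)

print("retrieved memory index:", best)
print("transferred contact point:", tgt_point)
```

In a real system the synthetic arrays would be replaced by features from pretrained models, and the transferred point would still have to be lifted to a 3D grasp pose. Both steps are outside the scope of this sketch.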
Taxonomy
Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning
Methods: Diffusion
