AffordGen: Generating Diverse Demonstrations for Generalizable Object Manipulation with Afford Correspondence

Jiawei Zhang; Kaizhe Hu; Yingqian Huang; Yuanchen Ju; Zhengrong Xue; Huazhe Xu

arXiv:2604.10579 · cs.RO · April 14, 2026

TL;DR

AffordGen leverages 3D generative and vision foundation models to create diverse, affordance-aware demonstrations, enhancing robot manipulation generalization and data efficiency.

Contribution

AffordGen introduces a framework that combines 3D generative models with semantic keypoint correspondence across 3D meshes to generate diverse manipulation demonstrations for robot learning.

Findings

01. Policies trained with AffordGen achieve high success rates in both simulation and the real world.

02. The trained policies generalize zero-shot to unseen objects.

03. Data efficiency for robot learning improves significantly.

Abstract

Despite the recent success of modern imitation learning methods in robot manipulation, their performance is often constrained by geometric variations due to limited data diversity. Leveraging powerful 3D generative models and vision foundation models (VFMs), the proposed AffordGen framework overcomes this limitation by utilizing the semantic correspondence of meaningful keypoints across large-scale 3D meshes to generate new robot manipulation trajectories. This large-scale, affordance-aware dataset is then used to train a robust, closed-loop visuomotor policy, combining the semantic generalizability of affordances with the reactive robustness of end-to-end learning. Experiments in simulation and the real world show that policies trained with AffordGen achieve high success rates and enable zero-shot generalization to truly unseen objects, significantly improving data efficiency in robot…
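
The idea of reusing a demonstration through keypoint correspondence can be illustrated with a small sketch. This is not the authors' implementation: it assumes per-vertex feature vectors are already available (in practice they might come from a vision foundation model; here they are synthetic), matches keypoints by cosine similarity, and re-targets end-effector waypoints with a single rigid transform estimated by the Kabsch algorithm. The function names `match_keypoints`, `kabsch`, and `retarget_waypoints`, and all data below, are hypothetical.

```python
import numpy as np


def match_keypoints(src_feats, tgt_feats, src_kp_idx):
    """For each source keypoint, return the index of the target vertex
    whose feature vector has the highest cosine similarity."""
    src = src_feats[src_kp_idx]
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt_feats / np.linalg.norm(tgt_feats, axis=1, keepdims=True)
    return np.argmax(src @ tgt.T, axis=1)


def kabsch(P, Q):
    """Least-squares rigid transform (R, t) such that R @ P[i] + t ~= Q[i]."""
    Pc, Qc = P.mean(axis=0), Q.mean(axis=0)
    H = (P - Pc).T @ (Q - Qc)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])           # guard against reflections
    R = Vt.T @ D @ U.T
    t = Qc - R @ Pc
    return R, t


def retarget_waypoints(waypoints, R, t):
    """Apply the rigid transform to an (N, 3) array of waypoint positions."""
    return waypoints @ R.T + t


# Synthetic stand-in data (assumption): the "target" object is the source
# object rotated, translated, and with its vertices shuffled.
rng = np.random.default_rng(0)
src_pts = rng.normal(size=(50, 3))
theta = 0.5
R_true = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.1, -0.2, 0.3])
perm = rng.permutation(50)
tgt_pts = (src_pts @ R_true.T + t_true)[perm]

# Synthetic per-vertex features; corresponding vertices share a feature.
src_feats = rng.normal(size=(50, 16))
tgt_feats = src_feats[perm]

# Keypoints annotated on the source object (hypothetical indices).
kp_idx = np.array([0, 7, 19, 33])
tgt_idx = match_keypoints(src_feats, tgt_feats, kp_idx)

# Estimate the source-to-target transform from the matched keypoints and
# move the demonstrated waypoints onto the new object.
R, t = kabsch(src_pts[kp_idx], tgt_pts[tgt_idx])
waypoints = np.array([[0.0, 0.0, 0.5], [0.0, 0.0, 0.2], [0.1, 0.0, 0.2]])
new_waypoints = retarget_waypoints(waypoints, R, t)
```

A single rigid transform is a deliberate simplification: it cannot account for shape differences between objects, which is the kind of geometric variation the abstract describes. It is used here only to make the correspondence-then-transfer step concrete.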
