TL;DR
This paper introduces visuomotor affordance learning (VAL), a method that lets robots adapt rapidly to new environments when zero-shot generalization falls short: a generative model trained on prior data samples possible outcomes, and the policy is then trained further to achieve them.
Contribution
The paper presents a novel approach using generative models to learn visual affordances, allowing robots to sample possible outcomes and quickly adapt to new tasks with minimal online training.
Findings
VAL enables rapid adaptation to new environments.
Robots can learn new manipulation skills with only five minutes of online data.
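The approach targets real-world tasks where zero-shot generalization fails, fine-tuning the policy toward sampled outcomes rather than relying on it to transfer unchanged.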
Abstract
A generalist robot equipped with learned skills must be able to perform many tasks in many different environments. However, zero-shot generalization to new settings is not always possible. When the robot encounters a new environment or object, it may need to finetune some of its previously learned skills to accommodate this change. But crucially, previously learned behaviors and models should still be suitable to accelerate this relearning. In this paper, we aim to study how generative models of possible outcomes can allow a robot to learn visual representations of affordances, so that the robot can sample potentially possible outcomes in new situations, and then further train its policy to achieve those outcomes. In effect, prior data is used to learn what kinds of outcomes may be possible, such that when the robot encounters an unfamiliar setting, it can sample potential outcomes from…
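The loop the abstract describes (learn from prior data which outcomes are possible, sample outcomes in a new setting, then train the policy further to reach them) can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the 1-D world, the offset-memory "generative model", the proportional policy, the hill-climbing update, and all function names are hypothetical stand-ins for the image-based models and policy training the paper uses.

```python
import random

def fit_outcome_model(prior_pairs):
    """Toy 'generative model' (assumption): remember the (goal - start) offsets in prior data."""
    return [goal - start for start, goal in prior_pairs]

def sample_outcome(offsets, obs, rng):
    """Sample an outcome that prior data suggests may be possible from this observation."""
    return obs + rng.choice(offsets)

def rollout(gain, start, goal, dynamics_scale, steps=5):
    """Run a goal-conditioned proportional policy; return the final distance to the goal."""
    state = start
    for _ in range(steps):
        action = gain * (goal - state)
        state += dynamics_scale * action  # the new environment rescales the effect of actions
    return abs(goal - state)

def finetune(gain, offsets, dynamics_scale, rng, iters=200, step=0.1):
    """Online fine-tuning by hill climbing (a stand-in for policy training, not the paper's method)."""
    for _ in range(iters):
        start = rng.uniform(-1.0, 1.0)
        goal = sample_outcome(offsets, start, rng)  # practice on a sampled outcome
        candidate = gain + rng.uniform(-step, step)
        # Keep the perturbed policy only if it gets closer to the sampled goal.
        if rollout(candidate, start, goal, dynamics_scale) < rollout(gain, start, goal, dynamics_scale):
            gain = candidate
    return gain
```

As a usage example, calling fit_outcome_model on a few (start, goal) pairs and then finetune(0.5, offsets, 0.3, random.Random(0)) adapts a gain that suited the old dynamics to an environment where actions have only 30% of their former effect. The point of the sketch is only the division of labor: prior data decides which goals are worth practicing, and online experience adjusts the policy.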
