Autonomous Improvement of Instruction Following Skills via Foundation Models
Zhiyuan Zhou, Pranav Atreya, Abraham Lee, Homer Walke, Oier Mees, Sergey Levine

TL;DR
This paper presents a framework that uses vision-language models to let instruction-following robots autonomously improve their skills by collecting and learning from large-scale, unannotated experience data in diverse environments.
Contribution
The paper introduces a new approach that automates data collection and learning from autonomous, non-optimal data without human supervision, enhancing robot instruction-following capabilities.
Findings
Robot policy improved 2x in unseen environments
Collected 30.5K autonomous trajectories across five environments
Demonstrated effectiveness in real-world experiments
Abstract
Intelligent instruction-following robots capable of improving from autonomously collected experience have the potential to transform robot learning: instead of collecting costly teleoperated demonstration data, large-scale deployment of fleets of robots can quickly collect larger quantities of autonomous data that can collectively improve their performance. However, autonomous improvement requires solving two key problems: (i) fully automating a scalable data collection procedure that can collect diverse and semantically meaningful robot data and (ii) learning from non-optimal, autonomous data with no human annotations. To this end, we propose a novel approach that addresses these challenges, allowing instruction-following policies to improve from autonomously collected data without human supervision. Our framework leverages vision-language models to collect and evaluate semantically…
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
