Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Chuyuan Fu, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey

TL;DR
This paper introduces a method that combines large language models with pretrained robotic skills to enable robots to understand and execute complex natural language instructions effectively in real-world environments.
Contribution
The paper presents a framework that grounds language models in pretrained robotic skills, so that the actions the model proposes are both feasible for the robot and appropriate to its current context.
Findings
Successfully completes long-horizon, natural language instructions in real-world robotic tasks.
Demonstrates the importance of real-world grounding for language-based robotic control.
Shows that combining language models with skills enhances task execution accuracy.
Abstract
Large language models can encode a wealth of semantic knowledge about the world. Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language. However, a significant weakness of language models is that they lack real-world experience, which makes it difficult to leverage them for decision making within a given embodiment. For example, asking a language model to describe how to clean a spill might result in a reasonable narrative, but it may not be applicable to a particular agent, such as a robot, that needs to perform this task in a particular environment. We propose to provide real-world grounding by means of pretrained skills, which are used to constrain the model to propose natural language actions that are both feasible and contextually appropriate. The robot can act as the language model's "hands…
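The grounding idea described in the abstract can be illustrated with a minimal sketch. It assumes that each candidate skill receives two scores: one from the language model (how useful the skill is for the instruction) and one from the pretrained skill (how feasible it is in the current state), and that the two are combined by multiplication. The function name select_skill, the skill descriptions, and all numbers are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict


def select_skill(llm_scores: Dict[str, float],
                 affordances: Dict[str, float]) -> str:
    """Pick the skill with the highest combined score.

    llm_scores:  hypothetical language-model scores, one per skill
                 description (higher = more relevant to the instruction).
    affordances: hypothetical feasibility scores from the pretrained
                 skills (higher = more likely to succeed right now).
    """
    # Assumption: combine the two scores by multiplication, so a skill
    # must be both relevant and feasible to rank highly. Skills with no
    # feasibility estimate are treated as infeasible (score 0.0).
    combined = {
        skill: llm_scores[skill] * affordances.get(skill, 0.0)
        for skill in llm_scores
    }
    return max(combined, key=combined.get)


# Hypothetical scores for an instruction such as "clean up the spill".
llm_scores = {
    "find a sponge": 0.30,
    "pick up the sponge": 0.50,
    "go to the trash can": 0.20,
}
# The robot does not see a sponge yet, so picking one up is unlikely to work.
affordances = {
    "find a sponge": 0.90,
    "pick up the sponge": 0.10,
    "go to the trash can": 0.80,
}

best = select_skill(llm_scores, affordances)
print(best)
```

In this toy setup the language model alone would prefer "pick up the sponge", but its low feasibility score pushes the combined choice to a skill the robot can actually perform from its current state.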
Videos
Author Interview: SayCan - Do As I Can, Not As I Say: Grounding Language in Robotic Affordances · youtube
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances (SayCan - Paper Explained) · youtube
Google’s New Robot: Your Personal Butler! 🤖 · youtube
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
