Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete, Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre, Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman,, Brian Ichter

TL;DR
This paper explores how Large Language Models can use natural language feedback to improve planning and reasoning in embodied robotic tasks without additional training, by forming an inner monologue for better decision-making.
Contribution
It introduces the concept of leveraging environment feedback for LLMs to enhance embodied reasoning through an inner monologue, improving task performance in robotic control scenarios.
Findings
Language feedback improves instruction completion in robotic tasks
Inner monologue enables better reasoning in embodied environments
Effective across simulated and real-world tasks
Abstract
Recent works have shown how the reasoning capabilities of Large Language Models (LLMs) can be applied to domains beyond natural language processing, such as planning and interaction for robots. These embodied problems require an agent to understand many semantic aspects of the world: the repertoire of skills available, how these skills influence the world, and how changes to the world map back to the language. LLMs planning in embodied environments need to consider not just what skills to do, but also how and when to do them - answers that change over time in response to the agent's own choices. In this work, we investigate to what extent LLMs used in such embodied contexts can reason over sources of feedback provided through natural language, without any additional training. We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
