In-Context Learning Enables Robot Action Prediction in LLMs
Yida Yin, Zekai Wang, Yuvan Sharma, Dantong Niu, Trevor Darrell, Roei Herzig

TL;DR
This paper introduces RoboPrompt, a framework that uses in-context learning in large language models to predict robot actions directly from textual descriptions without additional training, and that outperforms zero-shot baselines in both simulated and real-world tasks.
Contribution
The paper presents RoboPrompt, a novel method that enables off-the-shelf LLMs to predict robot actions through in-context learning (ICL), bypassing the need for training on robot-specific data.
Findings
RoboPrompt outperforms zero-shot baselines in robot action prediction.
The approach is effective in both simulated and real-world environments.
Structured ICL demonstrations improve prediction accuracy.
Abstract
Recently, Large Language Models (LLMs) have achieved remarkable success using in-context learning (ICL) in the language domain. However, leveraging the ICL capabilities within LLMs to directly predict robot actions remains largely unexplored. In this paper, we introduce RoboPrompt, a framework that enables off-the-shelf text-only LLMs to directly predict robot actions through ICL without training. Our approach first heuristically identifies keyframes that capture important moments from an episode. Next, we extract end-effector actions from these keyframes as well as the estimated initial object poses, and both are converted into textual descriptions. Finally, we construct a structured template to form ICL demonstrations from these textual descriptions and a task instruction. This enables an LLM to directly predict robot actions at test time. Through extensive experiments and analysis,…
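The pipeline described in the abstract (keyframe selection, conversion of actions and object poses to text, and a structured ICL template) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Step` layout, the keyframe heuristic (a gripper toggle or near-zero speed), the two-decimal number formatting, and the `Task:` / `Objects:` / `Actions:` template wording are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Step:
    """One timestep of a recorded episode (hypothetical layout, not from the paper)."""
    ee_pose: Sequence[float]  # end-effector pose, e.g. x, y, z
    gripper_open: bool        # gripper state at this step
    speed: float              # scalar magnitude of arm velocity


def find_keyframes(steps: List[Step], speed_eps: float = 1e-2) -> List[int]:
    """Assumed heuristic: keep steps where the gripper toggles or the arm is nearly still."""
    keyframes = []
    for i in range(1, len(steps)):
        toggled = steps[i].gripper_open != steps[i - 1].gripper_open
        stopped = steps[i].speed < speed_eps
        if toggled or stopped:
            keyframes.append(i)
    return keyframes


def format_values(values: Sequence[float]) -> str:
    """Render numbers as a compact bracketed list with two decimals."""
    return "[" + ", ".join(f"{v:.2f}" for v in values) + "]"


def action_to_text(step: Step) -> str:
    """End-effector pose plus gripper flag (1 = open, 0 = closed) as text."""
    return format_values(list(step.ee_pose) + [1.0 if step.gripper_open else 0.0])


def objects_to_text(object_poses: Dict[str, Sequence[float]]) -> str:
    """Initial object poses as 'name: [values]' entries."""
    return "; ".join(f"{name}: {format_values(pose)}" for name, pose in object_poses.items())


def demo_to_text(object_poses: Dict[str, Sequence[float]], steps: List[Step]) -> str:
    """One ICL demonstration: initial object poses followed by keyframe actions."""
    actions = ", ".join(action_to_text(steps[i]) for i in find_keyframes(steps))
    return f"Objects: {objects_to_text(object_poses)}\nActions: {actions}"


def build_prompt(instruction: str, demos: List[str],
                 test_object_poses: Dict[str, Sequence[float]]) -> str:
    """Hypothetical template: task instruction, demonstrations, then an open query."""
    query = f"Objects: {objects_to_text(test_object_poses)}\nActions:"
    return "\n\n".join([f"Task: {instruction}", *demos, query])
```

The resulting string would be sent to a text-only LLM, whose completion after the final `Actions:` is parsed back into numeric end-effector actions. Keeping every demonstration in the same fixed layout is what makes that completion straightforward to parse.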
Taxonomy
Topics: Context-Aware Activity Recognition Systems
