In-Context Iterative Policy Improvement for Dynamic Manipulation
Mark Van der Merwe, Devesh Jha

TL;DR
This paper introduces an iterative in-context learning approach to dynamic manipulation using pre-trained language models, and reports that it outperforms alternative methods in low-data scenarios in both simulation and real-robot experiments.
Contribution
The paper presents an iterative in-context learning method that applies pre-trained language models to dynamic manipulation, addressing challenges such as high dimensionality and partial observability.
Findings
Outperforms alternative methods in simulation and on physical robots
Effective in low-data regimes for complex manipulation tasks
Demonstrates the applicability of language models to robotics
Abstract
Attention-based architectures trained on internet-scale language data have demonstrated state-of-the-art reasoning ability on various language-based tasks, such as logic problems and textual reasoning. Additionally, these Large Language Models (LLMs) have exhibited the ability to perform few-shot prediction via in-context learning, in which input-output examples provided in the prompt are generalized to new inputs. This ability extends beyond standard language tasks, enabling few-shot learning of general patterns. In this work, we consider the application of in-context learning with pre-trained language models to dynamic manipulation. Dynamic manipulation introduces several crucial challenges, including increased dimensionality, complex dynamics, and partial observability. To address this, we take an iterative approach, and formulate our in-context learning problem to…
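The abstract is cut off before it states the paper's actual formulation, so the following is only a generic sketch of the two ideas it does name: input-output examples serialized into a prompt, and an iterative loop in which each trial adds a new example to the context. Every name here (`build_prompt`, `iterative_icl`, `predict`, `run_trial`) and the text format of the prompt are assumptions made for illustration, not the authors' method. The language model is represented by a caller-supplied function rather than any real API.

```python
from typing import Callable, List, Sequence, Tuple

# An (input, output) pair of numeric vectors. What these vectors mean for a
# given task is left open; the paper's own choice is not visible in the abstract.
Example = Tuple[Sequence[float], Sequence[float]]


def fmt(vec: Sequence[float]) -> str:
    """Render a numeric vector as space-separated values with two decimals."""
    return " ".join(f"{v:.2f}" for v in vec)


def build_prompt(examples: Sequence[Example], query: Sequence[float]) -> str:
    """Serialize the examples, then the new input with its output left blank."""
    lines = [f"input: {fmt(x)} -> output: {fmt(y)}" for x, y in examples]
    lines.append(f"input: {fmt(query)} -> output:")
    return "\n".join(lines)


def iterative_icl(
    predict: Callable[[str], Sequence[float]],
    run_trial: Callable[[Sequence[float]], Tuple[Example, Sequence[float]]],
    examples: Sequence[Example],
    query: Sequence[float],
    steps: int,
) -> List[Example]:
    """Grow the in-context example set over `steps` trials.

    `predict` is a hypothetical stand-in for a language-model call that maps a
    prompt to an output vector. `run_trial` is a hypothetical stand-in for
    executing that output and returning (new example, next query).
    """
    context = list(examples)
    for _ in range(steps):
        proposal = predict(build_prompt(context, query))
        new_example, query = run_trial(proposal)
        context.append(new_example)  # the next prompt includes this trial
    return context
```

In this sketch the context grows by one example per trial, so later prompts condition on more data than earlier ones; how the paper selects, orders, or encodes its examples may differ.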
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Topic Modeling
