Robust Instant Policy: Leveraging Student's t-Regression Model for Robust In-context Imitation Learning of Robot Manipulation

Hanbit Oh, Andrea M. Salcedo-Vázquez, Ixchel G. Ramirez-Alpizar, and Yukiyasu Domae

arXiv:2506.15157·cs.RO·June 19, 2025

TL;DR

This paper introduces a robust in-context imitation learning algorithm for robot manipulation that uses Student's t-regression to mitigate hallucinated trajectories from large language model-based policies, improving task success rates by at least 26% over state-of-the-art IL methods.

Contribution

The paper proposes the robust instant policy (RIP) algorithm, integrating Student's t-regression with in-context IL to enhance robustness against hallucinations in LLM-based robot policies.
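For reference, the robustness of a Student's t model comes from its heavy tails. The block below gives the standard univariate Student's t density; the symbols (location \mu, scale \sigma, degrees of freedom \nu) are generic notation assumed here, not taken from the paper.

```latex
p(y \mid \mu, \sigma, \nu)
  = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}
         {\Gamma\!\left(\frac{\nu}{2}\right)\sqrt{\nu\pi}\,\sigma}
    \left(1 + \frac{1}{\nu}\left(\frac{y-\mu}{\sigma}\right)^{2}\right)^{-\frac{\nu+1}{2}}
```

Because this density decays polynomially rather than exponentially in (y-\mu)/\sigma, a single far-off sample pulls the fitted location much less than it would under a Gaussian model. As \nu \to \infty the density approaches the Gaussian.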

Findings

01. RIP outperforms state-of-the-art IL methods by at least 26% in task success rate.

02. RIP is effective in both simulated and real-world environments.

03. RIP particularly improves performance in low-data scenarios.

Abstract

Imitation learning (IL) aims to enable robots to perform tasks autonomously by observing a few human demonstrations. Recently, a variant of IL called In-Context IL has used off-the-shelf large language models (LLMs) as instant policies that infer the context from a few given demonstrations to perform a new task, rather than explicitly updating network models with large-scale demonstrations. However, its reliability in the robotics domain is undermined by hallucination: an LLM-based instant policy occasionally generates poor trajectories that deviate from the given demonstrations. To alleviate this problem, we propose a new robust in-context imitation learning algorithm called the robust instant policy (RIP), which utilizes a Student's t-regression model to be robust against the hallucinated trajectories of instant policies to allow reliable trajectory…
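The abstract is truncated here and does not spell out how the Student's t-regression model is fitted, so the following is only a generic sketch of the underlying idea, not the RIP algorithm itself. It assumes several candidate trajectories are available for the same task, some of them hallucinated, and estimates a robust waypoint at each timestep with the standard EM updates for an isotropic multivariate Student's t with fixed degrees of freedom. The function names `t_location` and `robust_trajectory`, the choice `nu=3.0`, and the synthetic data are all hypothetical.

```python
import numpy as np


def t_location(points, nu=3.0, n_iter=100, tol=1e-10):
    """EM estimate of the location of an isotropic multivariate Student's t.

    points: (K, D) array of K candidate values for one D-dimensional waypoint.
    nu: degrees of freedom, held fixed (an assumed value, not from the paper).
    Returns (mu, weights); a low weight marks a likely outlier.
    """
    x = np.asarray(points, dtype=float)
    k, d = x.shape
    mu = np.median(x, axis=0)  # robust starting point
    sq = np.sum((x - mu) ** 2, axis=1)  # squared distances to mu
    sigma2 = max(sq.mean() / d, 1e-12)  # isotropic scale, floored
    w = np.ones(k)
    for _ in range(n_iter):
        # E-step: samples far from mu (relative to sigma2) get small weights.
        w = (nu + d) / (nu + sq / sigma2)
        # M-step: weighted mean, then weighted isotropic scale.
        new_mu = (w[:, None] * x).sum(axis=0) / w.sum()
        sq = np.sum((x - new_mu) ** 2, axis=1)
        sigma2 = max(np.sum(w * sq) / (k * d), 1e-12)
        shift = np.max(np.abs(new_mu - mu))
        mu = new_mu
        if shift < tol:
            break
    return mu, w


def robust_trajectory(candidates, nu=3.0):
    """Aggregate K candidate trajectories of shape (K, T, D) into one (T, D)."""
    c = np.asarray(candidates, dtype=float)
    return np.stack(
        [t_location(c[:, t, :], nu=nu)[0] for t in range(c.shape[1])]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    line = np.linspace(0.0, 1.0, 10)
    truth = np.stack([line, line], axis=1)  # (T=10, D=2) reference path
    good = truth + 0.01 * rng.standard_normal((4, 10, 2))  # plausible samples
    bad = (truth + 5.0)[None]  # one synthetic "hallucinated" trajectory
    candidates = np.concatenate([good, bad], axis=0)
    robust = robust_trajectory(candidates)
    naive = candidates.mean(axis=0)
    print("max error, plain mean:", np.max(np.abs(naive - truth)))
    print("max error, t estimate:", np.max(np.abs(robust - truth)))
```

In this toy setup the plain average is dragged toward the outlying trajectory, while the t-based estimate downweights it and stays near the consistent candidates. Treating each timestep independently is a simplification made for brevity.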

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Machine Learning and Data Classification