PRISM: Personalized Refinement of Imitation Skills for Manipulation via Human Instructions
Arnau Boix-Granell, Alberto San-Miguel-Tello, Magí Dalmau-Moreno, Néstor García

TL;DR
PRISM is a method that combines imitation and reinforcement learning, using human instructions to refine robotic manipulation policies for better robustness and data efficiency in unseen tasks.
Contribution
It introduces a novel instruction-conditioned refinement approach that integrates human feedback and RL to adapt imitation policies to new goals and constraints.
Findings
Outperforms non-feedback policies in simulated pick-and-place tasks
Enhances robustness and reduces computational burden
Enables policy reusability and data efficiency
Abstract
This paper presents PRISM, an instruction-conditioned refinement method for imitation policies in robotic manipulation. The approach bridges the Imitation Learning (IL) and Reinforcement Learning (RL) frameworks into a seamless pipeline, such that an imitation policy for a broad generic task, generated from a set of user-guided demonstrations, can be refined through reinforcement to generate new, unseen fine-grained behaviours. The refinement process follows the Eureka paradigm, where reward functions for RL are iteratively generated from an initial natural-language task description. The presented approach builds on this mechanism to adapt a refined IL policy of a generic task to new goal configurations and to newly introduced constraints by also adding human feedback correction on intermediate rollouts, enabling policy reusability and therefore data efficiency. Results for a…
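The refinement loop described in the abstract can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `refine_policy`, `Candidate`, and the three `stub_*` functions are hypothetical, the reward generator stands in for an LLM call, and the training stub stands in for RL fine-tuning of the imitation policy followed by rollout evaluation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    reward_source: str  # text of a generated reward function (hypothetical format)
    score: float        # task metric measured after RL fine-tuning with that reward


def refine_policy(
    task_description: str,
    propose_reward: Callable[[str, Optional[str], Optional[str]], str],
    train_and_evaluate: Callable[[str], float],
    human_feedback: Callable[[Candidate], str],
    iterations: int = 3,
    samples_per_iteration: int = 2,
) -> Tuple[Candidate, List[float]]:
    """Eureka-style loop with a human correction step after each iteration."""
    if iterations < 1 or samples_per_iteration < 1:
        raise ValueError("need at least one iteration and one sample")

    best: Optional[Candidate] = None
    feedback: Optional[str] = None
    best_scores: List[float] = []

    for _ in range(iterations):
        for _ in range(samples_per_iteration):
            # Condition the next reward on the task text, the latest human
            # feedback, and the best reward found so far.
            previous = best.reward_source if best else None
            source = propose_reward(task_description, feedback, previous)
            candidate = Candidate(source, train_and_evaluate(source))
            if best is None or candidate.score > best.score:
                best = candidate
        # A person inspects rollouts of the best candidate and states a correction.
        feedback = human_feedback(best)
        best_scores.append(best.score)

    return best, best_scores


# Deterministic stand-ins so the loop can be exercised without an LLM or simulator.
def stub_propose(task: str, feedback: Optional[str], previous: Optional[str]) -> str:
    parts = [task] + ([feedback] if feedback else [])
    return "reward(" + " | ".join(parts) + ")"


def stub_train(source: str) -> float:
    # Placeholder for a rollout success rate: rewards that encode the
    # human constraint are treated as better.
    return 1.0 if "gripper" in source else 0.5


def stub_feedback(candidate: Candidate) -> str:
    return "keep the gripper above the table"
```

Calling `refine_policy("pick the cube and place it in the bin", stub_propose, stub_train, stub_feedback)` runs three iterations. In this toy setup the human correction only reaches the reward generator from the second iteration onward, which is where the best score improves.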
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms
