Dynamic Non-Prehensile Object Transport via Model-Predictive Reinforcement Learning
Neel Jawale, Byron Boots, Balakumar Sundaralingam, Mohak Bhardwaj

TL;DR
This paper introduces a reinforcement learning approach combined with model-predictive control for teaching robots to perform dynamic non-prehensile object transport tasks efficiently from limited demonstration data, ensuring robustness and generalization.
Contribution
The paper presents a novel method integrating batch RL with MPC using ensemble value functions pretrained on demonstrations, enabling robust, data-efficient learning of complex manipulation tasks.
Findings
Trained successfully from 50-100 real-world robot demonstrations.
Generalized robustly to unseen objects.
Improved on the performance of the suboptimal demonstrations it was trained on.
Abstract
We investigate the problem of teaching a robot manipulator to perform dynamic non-prehensile object transport, also known as the 'robot waiter' task, from a limited set of real-world demonstrations. We propose an approach that combines batch reinforcement learning (RL) with model-predictive control (MPC) by pretraining an ensemble of value functions from demonstration data, and utilizing them online within an uncertainty-aware MPC scheme to ensure robustness to limited data coverage. Our approach is straightforward to integrate with off-the-shelf MPC frameworks and enables learning solely from task space demonstrations with sparsely labeled transitions, while leveraging MPC to ensure smooth joint space motions and constraint satisfaction. We validate the proposed approach through extensive simulated and real-world experiments on a Franka Panda robot performing the robot waiter task and…
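As a rough illustration of how an ensemble of value functions might feed an uncertainty-aware MPC step, the sketch below scores candidate rollouts by their accumulated cost plus a pessimistic terminal value. Everything here is an assumption made for illustration: the abstract does not state the exact uncertainty rule, so the mean-minus-`beta`-times-standard-deviation bound, the names `pessimistic_value` and `score_rollouts`, and the toy one-dimensional value functions are hypothetical and are not taken from the paper.

```python
import numpy as np


def pessimistic_value(ensemble_values, beta=1.0):
    """Lower-confidence-bound estimate over an ensemble (assumed rule).

    ensemble_values has shape (n_members, n_candidates). Candidates where the
    members disagree (high std) are penalized, which discourages the planner
    from visiting states poorly covered by the demonstration data.
    """
    return ensemble_values.mean(axis=0) - beta * ensemble_values.std(axis=0)


def score_rollouts(rollout_costs, terminal_states, value_fns, beta=1.0):
    """Score each candidate rollout: negative cost plus pessimistic terminal value."""
    values = np.stack([v(terminal_states) for v in value_fns])
    return -rollout_costs + pessimistic_value(values, beta)


# Toy stand-ins for pretrained value functions (hypothetical): the members
# agree at state 0 and disagree more as the state moves away from 0.
value_fns = [lambda s, w=w: -w * np.abs(s) for w in (0.5, 1.0, 1.5)]

terminal_states = np.array([0.0, 2.0])  # terminal state of each candidate rollout
rollout_costs = np.array([1.0, 0.5])    # accumulated running cost of each rollout

scores = score_rollouts(rollout_costs, terminal_states, value_fns)
best = int(np.argmax(scores))           # index of the rollout the planner would pick
```

In this toy case the second rollout has the lower running cost, but the ensemble disagrees about its terminal state, so the pessimistic score favors the first rollout. A sampling-based MPC loop would repeat this scoring at every control step over freshly sampled action sequences.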
Taxonomy
Topics: Reinforcement Learning in Robotics · Industrial Vision Systems and Defect Detection · Robot Manipulation and Learning
Methods: Sparse Evolutionary Training
