Waypoint-Based Reinforcement Learning for Robot Manipulation Tasks

Shaunak A. Mehta; Soheil Habibian; Dylan P. Losey

arXiv:2403.13281 · cs.RO · March 21, 2024 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a waypoint-based reinforcement learning framework for robot manipulation, reformulating the problem as a sequence of bandit tasks, which improves learning speed and efficiency over traditional low-level policy methods.

Contribution

The paper proposes a waypoint-based approach to model-free reinforcement learning that frames the task as a sequence of bandit problems, supported by theoretical regret bounds and by faster learning than baselines in practice.
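
To make the "sequence of bandit problems" idea concrete, here is a minimal sketch under stated assumptions. It is not the authors' implementation: the toy 2D reward, the uniform random search over candidates, and the names `rollout_reward` and `learn_waypoints` are all hypothetical and chosen only for illustration. Each waypoint is treated as its own bandit; once the best candidate is found it is frozen, and the next waypoint is learned on top of it.

```python
import math
import random


def rollout_reward(waypoints, goal):
    """Toy reward (an assumption, not from the paper): negative distance from
    the final waypoint to the goal, minus a small path-length penalty.
    The robot is assumed to start at the origin."""
    pos = (0.0, 0.0)
    length = 0.0
    for w in waypoints:
        length += math.dist(pos, w)
        pos = w
    return -math.dist(pos, goal) - 0.01 * length


def learn_waypoints(goal, n_waypoints=2, n_pulls=200, seed=0):
    """Learn waypoints one at a time. Each waypoint is a separate bandit:
    every 'pull' proposes a candidate waypoint, runs the full rollout, and
    observes a single scalar reward. The best candidate is then frozen."""
    rng = random.Random(seed)
    frozen = []
    for _ in range(n_waypoints):
        best_w, best_r = None, -math.inf
        for _ in range(n_pulls):
            # Hypothetical exploration rule: uniform sampling in the workspace.
            cand = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
            r = rollout_reward(frozen + [cand], goal)
            if r > best_r:
                best_w, best_r = cand, r
        frozen.append(best_w)
    return frozen, rollout_reward(frozen, goal)


waypoints, reward = learn_waypoints(goal=(0.5, -0.3))
print(waypoints, reward)
```

The point of the sketch is the structure rather than the search rule: the learner reasons over a handful of waypoints instead of hundreds of low-level actions, and each decision receives one reward per rollout. A real system would replace the uniform sampling with a learned reward model and would use an existing controller to move between waypoints.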

Findings

01. Faster learning of new tasks compared to baselines

02. Lower regret bounds in the waypoint bandit formulation

03. Effective real-world robot manipulation demonstrated

Abstract

Robot arms should be able to learn new tasks. One framework here is reinforcement learning, where the robot is given a reward function that encodes the task, and the robot autonomously learns actions to maximize its reward. Existing approaches to reinforcement learning often frame this problem as a Markov decision process, and learn a policy (or a hierarchy of policies) to complete the task. These policies reason over hundreds of fine-grained actions that the robot arm needs to take: e.g., moving slightly to the right or rotating the end-effector a few degrees. But the manipulation tasks that we want robots to perform can often be broken down into a small number of high-level motions: e.g., reaching an object or turning a handle. In this paper we therefore propose a waypoint-based approach for model-free reinforcement learning. Instead of learning a low-level policy, the robot now…

Code & Models

Repositories

vt-collab/rl-waypoints (Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning