Graph-Structured Policy Learning for Multi-Goal Manipulation Tasks

David Klee; Ondrej Biza; Robert Platt

arXiv:2207.11313·cs.RO·July 26, 2022

TL;DR

This paper introduces a pixel-based Q-learning approach built on a hand-coded, high-level discrete representation of the domain. A single network learns multi-goal robotic manipulation in simulation, and the learned policy is deployed on a real robot.

Contribution

The paper combines a hand-coded, high-level discrete representation with pixel-based Q-learning for multi-goal manipulation. The agent learns simpler, local policies with a single network, and planning in the abstract space sequences them to reach each goal.

Findings

01 Learned to build more than a hundred different block structures.

02 Demonstrated forward transfer to structures with novel objects.

03 Deployed the policy learned in simulation on a real robot.

Abstract

Multi-goal policy learning for robotic manipulation is challenging. Prior successes have used state-based representations of the objects or provided demonstration data to facilitate learning. In this paper, by hand-coding a high-level discrete representation of the domain, we show that policies to reach dozens of goals can be learned with a single network using Q-learning from pixels. The agent focuses learning on simpler, local policies which are sequenced together by planning in the abstract space. We compare our method against standard multi-goal RL baselines, as well as other methods that leverage the discrete representation, on a challenging block construction domain. We find that our method can build more than a hundred different block structures, and demonstrate forward transfer to structures with novel objects. Lastly, we deploy the policy learned in simulation on a real robot.
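
The idea of sequencing local policies by planning in the abstract space can be illustrated with a small sketch. This page does not describe the paper's actual discrete representation or planner, so everything below is an assumption made for illustration: the abstract states are hypothetical string labels, the toy graph is invented, and breadth-first search stands in for whatever planner the authors use. Each consecutive pair of states on the planned path corresponds to one local subgoal that a learned pixel-based policy would be asked to achieve.

```python
from collections import deque


def plan_subgoals(graph, start, goal):
    """Breadth-first search over a hypothetical abstract-state graph.

    `graph` maps each abstract state to the states reachable from it in
    one local step. Returns the shortest list of states from `start` to
    `goal` (inclusive), or None if the goal is unreachable.
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            # Walk parent pointers back to the start, then reverse.
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in graph.get(state, ()):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None


def local_transitions(path):
    """Split a planned path into (current, next) pairs, one per local policy call."""
    return list(zip(path[:-1], path[1:]))


# Invented toy graph of block structures (not taken from the paper).
TOY_GRAPH = {
    "empty": ["one_block"],
    "one_block": ["two_stack", "two_side_by_side"],
    "two_stack": ["three_tower"],
    "two_side_by_side": ["bridge"],
}

path = plan_subgoals(TOY_GRAPH, "empty", "bridge")
steps = local_transitions(path)
```

In this sketch the planner only decides which abstract transition to attempt next. Executing each transition from camera images is the part a learned Q-network would handle, and that part is deliberately left out here.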

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning