Graph-Structured Policy Learning for Multi-Goal Manipulation Tasks
David Klee, Ondrej Biza, Robert Platt

TL;DR
This paper pairs a hand-coded, high-level discrete representation of the domain with pixel-based Q-learning, so that a single network can learn multi-goal robotic manipulation policies that transfer from simulation to a real robot.
Contribution
The method combines a hand-coded discrete representation with Q-learning from pixels: the agent learns simpler, local policies, and planning in the abstract space sequences them to reach each goal.
Findings
Learned over a hundred block structures.
Achieved forward transfer to novel objects.
Successfully deployed policies on real robots.
Abstract
Multi-goal policy learning for robotic manipulation is challenging. Prior successes have used state-based representations of the objects or provided demonstration data to facilitate learning. In this paper, by hand-coding a high-level discrete representation of the domain, we show that policies to reach dozens of goals can be learned with a single network using Q-learning from pixels. The agent focuses learning on simpler, local policies which are sequenced together by planning in the abstract space. We compare our method against standard multi-goal RL baselines, as well as other methods that leverage the discrete representation, on a challenging block construction domain. We find that our method can build more than a hundred different block structures, and demonstrate forward transfer to structures with novel objects. Lastly, we deploy the policy learned in simulation on a real robot.
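The idea of sequencing local policies by planning in the abstract space can be illustrated with a minimal sketch. This is not the authors' implementation: the graph, the state names (`empty`, `one_block`, `bridge`, and so on), and the helper functions `plan_abstract_path` and `subgoal_sequence` are hypothetical, and breadth-first search is used here only as a simple stand-in planner. Each consecutive pair of abstract states on the planned path is a subgoal that a learned local policy would be asked to achieve.

```python
from collections import deque


def plan_abstract_path(graph, start, goal):
    """Breadth-first search over a directed abstract-state graph.

    Returns the list of abstract states from start to goal (inclusive),
    or None if the goal is unreachable.
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk parent pointers back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def subgoal_sequence(path):
    """Turn a path of abstract states into (current, next) subgoal pairs."""
    return list(zip(path, path[1:]))


# Hypothetical abstract states for a block construction domain (assumption).
graph = {
    "empty": ["one_block"],
    "one_block": ["two_stack", "two_row"],
    "two_stack": ["three_stack"],
    "two_row": ["bridge"],
}

path = plan_abstract_path(graph, "empty", "bridge")
subgoals = subgoal_sequence(path)
# In a full system, each (current, next) pair would condition a pixel-based
# local policy; that learned component is omitted from this sketch.
```

Under this framing the learner only has to master single transitions between neighbouring abstract states, while reaching a distant structure is left to the planner.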
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning
