Graph Policy Gradients for Large Scale Robot Control

Arbaaz Khan; Ekaterina Tolstaya; Alejandro Ribeiro; Vijay Kumar

arXiv:1907.03822 · cs.RO · December 3, 2019 · 21 cites

TL;DR

This paper introduces Graph Policy Gradients, a scalable reinforcement learning method using graph convolutional networks to control large homogeneous robot swarms efficiently and transfer policies across different swarm sizes.

Contribution

The paper proposes a novel graph-based policy gradient algorithm that leverages graph symmetry and local filters for scalable, transferable control policies in large robot swarms.

Findings

01 Scales better than existing reinforcement learning methods that use fully connected networks.

02 Enables zero-shot transfer of policies trained on a small group of robots to larger groups.

03 Demonstrates these results on formation flying tasks.

Abstract

In this paper, we consider the problem of learning policies to control a large number of homogeneous robots. To this end, we propose a new algorithm we call Graph Policy Gradients (GPG) that exploits the underlying graph symmetry among the robots. The curse of dimensionality one encounters when working with a large number of robots is mitigated by employing a graph convolutional network (GCN) to parametrize policies for the robots. The GCN reduces the dimensionality of the problem by learning filters that aggregate information among robots locally, similar to how a convolutional neural network learns local features in an image. Through experiments on formation flying, we show that our proposed method scales better than existing reinforcement learning methods that employ fully connected networks. More importantly, we show that by using our locally learned filters we are…
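The local aggregation described in the abstract can be illustrated with a generic polynomial graph filter. The sketch below is not taken from the paper or its repository: the function names `graph_filter` and `ring_adjacency`, the ring topology, the feature sizes, and the number of filter taps are all assumptions chosen for illustration. It shows that the filter weights do not depend on the number of robots, so one set of weights can be applied to a 3-robot graph and a 10-robot graph alike.

```python
import numpy as np

def graph_filter(S, X, H):
    """Generic K-tap graph convolution: sum over k of (S^k @ X @ H[k]).

    S: (N, N) graph shift operator, here the adjacency matrix of the robot graph
    X: (N, F_in) feature vector per robot
    H: (K, F_in, F_out) filter taps, shared by every robot
    Tap k mixes in information from robots up to k hops away.
    """
    out = np.zeros((X.shape[0], H.shape[2]))
    Z = X.copy()
    for k in range(H.shape[0]):
        out += Z @ H[k]   # contribution of the k-hop neighbourhood
        Z = S @ Z         # push features one hop further along the graph
    return out

def ring_adjacency(n):
    """Hypothetical topology: each robot talks to its two ring neighbours."""
    S = np.zeros((n, n))
    for i in range(n):
        S[i, (i + 1) % n] = 1.0
        S[i, (i - 1) % n] = 1.0
    return S

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 2, 2))      # 3 taps, 2 input features, 2 outputs (assumed sizes)

# The same weights H are reused for both swarm sizes.
X_small = rng.normal(size=(3, 2))
X_large = rng.normal(size=(10, 2))
S_small = ring_adjacency(3)
S_large = ring_adjacency(10)

out_small = np.tanh(graph_filter(S_small, X_small, H))
out_large = np.tanh(graph_filter(S_large, X_large, H))
print(out_small.shape, out_large.shape)
```

Because `H` has no dimension tied to the number of robots, the parameter count stays fixed as the swarm grows, whereas a fully connected layer over the concatenated state of all robots would grow with the swarm size. Relabeling the robots permutes the output rows in the same way, which is one way to read the "graph symmetry" the abstract refers to.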

Code & Models

Repositories

arbaazkhan2/gpg_labeled (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Advanced Bandit Algorithms Research