Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous

Rose E. Wang; J. Chase Kew; Dennis Lee; Tsang-Wei Edward Lee; Tingnan Zhang; Brian Ichter; Jie Tan; Aleksandra Faust

arXiv:2003.06906 · cs.MA · November 10, 2020 · 5 cites

Open Access

TL;DR

This paper introduces hierarchical predictive planning (HPP), a model-based reinforcement learning approach that enables decentralized multiagent rendezvous through learned motion predictions. HPP outperforms baselines in complex environments and transfers from simulation to the real world without fine-tuning.

Contribution

The paper presents HPP, a novel hierarchical predictive planning method that combines self-supervised motion prediction with decentralized decision-making for multiagent rendezvous.

Findings

1. HPP outperforms baseline methods in complex, unseen environments.

2. Prediction models transfer successfully from simulation to the real world without fine-tuning.

3. HPP enables decentralized coordination without explicit communication.

Abstract

Collaboration requires agents to align their goals on the fly. Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans. We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous. Starting with pretrained, single-agent point to point navigation policies and using noisy, high-dimensional sensor inputs like lidar, we first learn via self-supervision motion predictions of all agents on the team. Next, HPP uses the prediction models to propose and evaluate navigation subgoals for completing the rendezvous task without explicit communication among agents. We evaluate HPP in a suite of unseen environments, with increasing complexity and numbers of obstacles. We show that HPP outperforms alternative reinforcement learning,…
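The propose-and-evaluate step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `predict_position` is a hypothetical straight-line stand-in for the learned, lidar-conditioned motion predictors, random sampling stands in for whatever proposal mechanism HPP actually uses, and the scoring rule (largest pairwise distance between predicted positions) is an assumption chosen for illustration.

```python
import math
import random


def predict_position(pos, goal, speed, horizon):
    """Hypothetical stand-in for a learned motion predictor.

    Assumes the agent moves in a straight line toward `goal` at `speed`
    for `horizon` steps and stops once it arrives.
    """
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    reach = speed * horizon
    if dist <= reach:
        return goal
    scale = reach / dist
    return (pos[0] + dx * scale, pos[1] + dy * scale)


def spread(points):
    """Largest pairwise distance between points (0.0 when all coincide)."""
    return max(math.dist(p, q) for p in points for q in points)


def choose_subgoal(positions, speeds, horizon, num_candidates=200, rng=None):
    """Sample candidate subgoals and keep the one with the best predicted outcome.

    Each agent could run this independently on its own estimate of the
    team's positions, so no messages are exchanged (assumed setup).
    """
    rng = rng or random.Random(0)
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    best_goal, best_score = None, math.inf
    for _ in range(num_candidates):
        # Propose: sample a candidate inside the team's bounding box.
        candidate = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        # Evaluate: predict where every agent ends up if all head there.
        predicted = [
            predict_position(p, candidate, s, horizon)
            for p, s in zip(positions, speeds)
        ]
        score = spread(predicted)
        if score < best_score:
            best_goal, best_score = candidate, score
    return best_goal, best_score


if __name__ == "__main__":
    team = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
    goal, score = choose_subgoal(team, speeds=[1.0, 1.0, 1.0], horizon=3)
    print("chosen subgoal:", goal, "predicted spread:", score)
```

In this toy setup a lower score means the team is predicted to end up closer together, so replanning with the chosen subgoal at every step would pull the agents toward a common meeting point.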

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Distributed Control Multi-Agent Systems