Q-CP: Learning Action Values for Cooperative Planning

Francesco Riccio; Roberto Capobianco; Daniele Nardi

arXiv:1803.00297·cs.RO·March 2, 2018

TL;DR

Q-CP is a cooperative, model-based reinforcement learning algorithm that uses action values both to guide exploration and to generate policies for multi-robot planning tasks with large state spaces and high uncertainty.

Contribution

The paper introduces a model-based RL method that combines Q-learning with Monte-Carlo Tree Search to improve cooperative multi-robot planning under uncertainty.
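
For reference, the two standard ingredients named here are usually written as follows. These are the textbook forms; this summary does not state whether Q-CP uses them unchanged. The symbols (learning rate \alpha, discount factor \gamma, exploration constant c, visit counts N) are the conventional ones, not notation taken from the paper.

```latex
% Tabular Q-learning update for a sampled transition (s, a, r, s')
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]

% UCT (UCB1) action selection at a search-tree node for state s
a^{*} = \arg\max_{a} \left[ Q(s,a) + c \sqrt{\frac{\ln N(s)}{N(s,a)}} \right]
```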

Findings

1. Q-CP reduces computational demand in multi-robot planning.

2. Q-CP achieves effective coordination in various robot scenarios.

3. Q-CP outperforms baseline methods in stochastic cooperative games.

Abstract

Research on multi-robot systems has demonstrated promising results in manifold applications and domains. Still, efficiently learning effective robot behaviors is very difficult, due to unstructured scenarios, high uncertainties, and large state dimensionality (e.g. hyper-redundant robots and groups of robots). To alleviate this problem, we present Q-CP, a cooperative model-based reinforcement learning algorithm, which exploits action values to both (1) guide the exploration of the state space and (2) generate effective policies. Specifically, we exploit Q-learning to attack the curse of dimensionality in the iterations of a Monte-Carlo Tree Search. We implement and evaluate Q-CP on different stochastic cooperative (general-sum) games: (1) a simple cooperative navigation problem among 3 robots, (2) a cooperation scenario between a pair of KUKA YouBots performing hand-overs, and (3) a…
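
The idea of letting learned action values both steer sampling and define the final policy can be illustrated with a small sketch. This is not the authors' Q-CP algorithm: it is a simplified, single-agent toy that runs UCB-guided simulations from a fixed start state and applies a one-step Q-learning backup to every sampled transition, without building an explicit search tree. The corridor environment, the function names (`step`, `select`, `search`, `greedy_policy`), and all parameter values are assumptions made purely for illustration.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy problem (not from the paper): a 1-D corridor with cells 0..4.
# The agent starts in cell 0 and receives reward 1 on reaching cell 4.
GOAL = 4
ACTIONS = (-1, +1)

def step(state, action, rng):
    """Stochastic transition: the chosen move is reversed with probability 0.1."""
    if rng.random() < 0.1:
        action = -action
    nxt = min(max(state + action, 0), GOAL)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done

def select(q, n, state, c, rng):
    """UCB1-style choice: learned action value plus an exploration bonus."""
    total = sum(n[(state, a)] for a in ACTIONS)

    def score(a):
        if n[(state, a)] == 0:
            return float("inf")  # try every action at least once
        return q[(state, a)] + c * math.sqrt(math.log(total) / n[(state, a)])

    best = max(score(a) for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if score(a) == best])

def search(iterations=2000, horizon=20, alpha=0.1, gamma=0.9, c=1.0, seed=0):
    """Run repeated simulations from cell 0, learning Q(s, a) along the way."""
    rng = random.Random(seed)
    q = defaultdict(float)  # action values Q(s, a)
    n = defaultdict(int)    # visit counts N(s, a)
    for _ in range(iterations):
        state = 0
        for _ in range(horizon):
            action = select(q, n, state, c, rng)
            nxt, reward, done = step(state, action, rng)
            n[(state, action)] += 1
            # One-step Q-learning backup on the sampled transition.
            if done:
                target = reward
            else:
                target = reward + gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            if done:
                break
            state = nxt
    return q

def greedy_policy(q):
    """Read a policy off the learned action values (non-terminal cells only)."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
```

Calling `greedy_policy(search())` returns a mapping from each non-terminal cell to its highest-valued action. The same table `q` serves both roles described in the abstract: inside `select` it biases which actions are simulated, and in `greedy_policy` it yields the policy. A cooperative multi-robot version would need joint or per-agent action values, which this sketch does not attempt.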

Taxonomy

Methods: Monte-Carlo Tree Search · Q-Learning