Scalable Accelerated Decentralized Multi-Robot Policy Search in Continuous Observation Spaces
Shayegan Omidshafiei, Christopher Amato, Miao Liu, Michael Everett, Jonathan P. How, John Vian

TL;DR
This paper introduces a novel approach for solving continuous-observation Dec-POMDPs and Dec-POSMDPs in robotics using SK-FSAs, significantly improving scalability and convergence over existing discrete methods.
Contribution
It presents the first continuous-observation policy search method for Dec-POMDPs/Dec-POSMDPs using SK-FSAs and introduces an entropy injection technique for faster convergence.
Findings
Outperforms state-of-the-art discrete approaches in continuous domains
Demonstrates scalability to larger multi-robot systems
Entropy injection accelerates policy convergence without quality loss
Abstract
This paper presents the first ever approach for solving continuous-observation Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and their semi-Markovian counterparts, Dec-POSMDPs. This contribution is especially important in robotics, where a vast number of sensors provide continuous observation data. A continuous-observation policy representation is introduced using Stochastic Kernel-based Finite State Automata (SK-FSAs). An SK-FSA search algorithm titled Entropy-based Policy Search using Continuous Kernel Observations (EPSCKO) is introduced and applied to the first ever continuous-observation Dec-POMDP/Dec-POSMDP domain, where it significantly outperforms state-of-the-art discrete approaches. This methodology is equally applicable to Dec-POMDPs and Dec-POSMDPs, though the empirical analysis presented focuses on Dec-POSMDPs due to their higher…
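Illustrative sketch
The abstract does not spell out how an SK-FSA maps a continuous observation to a controller transition, so the following is only a rough illustration of the general idea of a kernel-based stochastic transition, not the paper's formulation. It assumes a Gaussian (RBF) kernel, one hypothetical kernel center per successor node, and a shared bandwidth; the names `rbf`, `transition_probs`, `step`, `centers`, and `bandwidth` are all invented for this example.

```python
import math
import random


def rbf(obs, center, bandwidth):
    """Gaussian (RBF) kernel between two equal-length observation vectors."""
    sq_dist = sum((o - c) ** 2 for o, c in zip(obs, center))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def transition_probs(obs, centers, bandwidth=1.0):
    """Turn a continuous observation into a distribution over successor nodes.

    centers[j] is a hypothetical kernel center tied to successor node j
    (an assumption made for this sketch, not taken from the paper).
    """
    weights = [rbf(obs, c, bandwidth) for c in centers]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed to zero: fall back to a uniform choice.
        return [1.0 / len(centers)] * len(centers)
    return [w / total for w in weights]


def step(obs, centers, rng, bandwidth=1.0):
    """Sample the index of the next controller node for this observation."""
    probs = transition_probs(obs, centers, bandwidth)
    return rng.choices(range(len(centers)), weights=probs, k=1)[0]


if __name__ == "__main__":
    centers = [[0.0, 0.0], [3.0, 3.0]]  # hypothetical kernel centers
    rng = random.Random(0)
    print(transition_probs([0.1, -0.2], centers))
    print(step([0.1, -0.2], centers, rng))
```

The point of the sketch is that no discretization of the observation space is needed: an observation near a kernel center makes the corresponding transition more likely, and the bandwidth controls how sharply that preference falls off.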
Taxonomy
Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Distributed systems and fault tolerance
