Learning Cooperation and Online Planning Through Simulation and Graph Convolutional Network
Rafid Ameer Mahmud, Fahim Faisal, Saaduddin Mahmud, Md. Mosaddek Khan

TL;DR
This paper introduces SiCLOP, an online planning algorithm for multi-agent cooperative environments. It combines Monte Carlo Tree Search (MCTS), a Graph Convolutional Network (GCN), and a Coordination Graph (CG) to learn cooperation, adapt to dynamic environments, and improve scalability. The authors report that it outperforms existing online planning methods.
Contribution
The paper presents SiCLOP, a new simulation-based online planning algorithm that integrates GCN and CG with MCTS, supporting transfer learning and real-time solutions in dynamic multi-agent settings.
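For readers unfamiliar with the MCTS component, the following is a minimal sketch of the standard UCB1 selection rule used in many MCTS variants. It is not taken from the paper, and SiCLOP's actual selection rule may differ. The function name `ucb1_select`, the `stats` layout, and the exploration constant `c` are assumptions made for illustration.

```python
import math

def ucb1_select(stats, total_visits, c=math.sqrt(2)):
    """Pick an action with the standard UCB1 rule.

    stats: dict mapping action -> (visit_count, total_reward).
    total_visits: number of visits to the parent node.
    c: exploration constant (hypothetical default, sqrt(2)).
    """
    best_action, best_score = None, -math.inf
    for action, (visits, total_reward) in stats.items():
        if visits == 0:
            # Unvisited actions are tried first.
            return action
        mean_reward = total_reward / visits
        exploration = c * math.sqrt(math.log(total_visits) / visits)
        score = mean_reward + exploration
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

With equal visit counts the rule prefers the action with the higher mean reward; with very unequal counts the exploration term favours the rarely tried action.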
Findings
SiCLOP outperforms state-of-the-art online planning algorithms.
Supports transfer learning across different environments.
Provides theoretical convergence analysis for multi-agent scenarios.
Abstract
The Multi-agent Markov Decision Process (MMDP) has been an effective way of modelling sequential decision-making algorithms for multi-agent cooperative environments. A number of algorithms based on centralized and decentralized planning have been developed in this domain. However, a dynamically changing environment, coupled with the exponential size of the state and joint action spaces, makes it difficult for these algorithms to provide both efficiency and scalability. Recently, the centralized planning algorithm FV-MCTS-MP and the decentralized planning algorithm Alternate maximization with Behavioural Cloning (ABC) have achieved notable performance in solving MMDPs. However, they are not capable of adapting to dynamically changing environments and accounting for the lack of communication among agents, respectively. Against this background, we introduce a simulation-based online planning…
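The exponential growth of the joint action space mentioned in the abstract can be illustrated with a short sketch. It assumes every agent chooses independently from its own finite action set; the helper names `joint_actions` and `joint_action_count` are hypothetical and do not come from the paper.

```python
from itertools import product

def joint_actions(per_agent_actions):
    """Enumerate every joint action as a tuple with one action per agent."""
    return list(product(*per_agent_actions))

def joint_action_count(num_actions, num_agents):
    """Size of the joint action space when each agent has num_actions choices."""
    return num_actions ** num_agents

# Three agents with two actions each give 2**3 joint actions.
small = joint_actions([["left", "right"]] * 3)
print(len(small))

# Ten agents with five actions each are already too many to enumerate naively.
print(joint_action_count(5, 10))
```

Under this assumption the count grows as the number of per-agent actions raised to the number of agents, which is why exhaustive search over joint actions stops being practical as agents are added.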
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Advanced Graph Neural Networks
Methods: Graph Neural Network · Pruning · Approximate Bayesian Computation
