Inverse Factorized Q-Learning for Cooperative Multi-agent Imitation Learning
The Viet Bui, Tien Mai, Thanh Hong Nguyen

TL;DR
This paper introduces an inverse factorized Q-learning algorithm for cooperative multi-agent imitation learning. It targets high-dimensional state and action spaces and inter-agent dependencies, and it outperforms existing multi-agent IL algorithms in complex multi-agent environments.
Contribution
The paper proposes a new multi-agent imitation learning method using mixing networks for centralized training, with theoretical convexity conditions and extensive experimental validation.
Findings
Outperforms existing multi-agent IL algorithms in challenging environments
Successfully learns local and joint value functions in multi-agent settings
Demonstrates effectiveness on complex environments like SMACv2
Abstract
This paper concerns imitation learning (IL) (i.e., the problem of learning to mimic expert behaviors from demonstrations) in cooperative multi-agent systems. The learning problem under consideration poses several challenges, characterized by high-dimensional state and action spaces and intricate inter-agent dependencies. In a single-agent setting, IL has been shown to be done efficiently through an inverse soft-Q learning process given expert demonstrations. However, extending this framework to a multi-agent context introduces the need to simultaneously learn both local value functions to capture local observations and individual actions, and a joint value function for exploiting centralized learning. In this work, we introduce a novel multi-agent IL algorithm designed to address these challenges. Our approach enables centralized learning by leveraging mixing networks to aggregate…
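The idea of aggregating local value functions into a joint value through a mixing network can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the class name `MixingNetwork`, the two-layer structure, the linear hypernetworks, the ELU activation, and the use of absolute values to keep the mixing weights non-negative (a QMIX-style choice). The paper's actual architecture and its convexity conditions may differ.

```python
import numpy as np

def elu(x):
    # ELU activation: identity for x > 0, exp(x) - 1 otherwise (monotone increasing).
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

class MixingNetwork:
    """Hypothetical two-layer mixer: joint value = f(local values; global state)."""

    def __init__(self, n_agents, state_dim, hidden_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.n_agents = n_agents
        self.hidden_dim = hidden_dim
        # Linear hypernetworks (an assumption) mapping the global state
        # to the mixer's weights and biases.
        self.hyper_w1 = rng.normal(size=(state_dim, n_agents * hidden_dim))
        self.hyper_b1 = rng.normal(size=(state_dim, hidden_dim))
        self.hyper_w2 = rng.normal(size=(state_dim, hidden_dim))
        self.hyper_b2 = rng.normal(size=(state_dim,))

    def __call__(self, local_q, state):
        # local_q: shape (n_agents,), one local value per agent.
        # state:   shape (state_dim,), the global state used in centralized training.
        # Absolute values keep the mixing weights non-negative, so the joint
        # value never decreases when any single local value increases.
        w1 = np.abs(state @ self.hyper_w1).reshape(self.n_agents, self.hidden_dim)
        b1 = state @ self.hyper_b1
        w2 = np.abs(state @ self.hyper_w2)
        b2 = state @ self.hyper_b2
        hidden = elu(local_q @ w1 + b1)
        return float(hidden @ w2 + b2)

# Usage: three agents, a four-dimensional global state (made-up numbers).
mixer = MixingNetwork(n_agents=3, state_dim=4)
state = np.array([0.5, -1.0, 0.2, 0.3])
local_q = np.array([1.0, 0.0, -0.5])
q_joint = mixer(local_q, state)
```

In this sketch each agent only needs its own local value to act, while the mixer is used during centralized training to tie the local values to a single joint value.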
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications
