Submodular Multi-Agent Policy Learning for Online Distributed Task Allocation in Open Multi-Agent Systems

Jing Liu; Yangyang Yang; Luca Ballotta; Fangfei Li; Yang Tang; Ruggero Carli

arXiv:2605.13269·eess.SY·May 14, 2026

TL;DR

This paper introduces the Partition Multilinear Extension (PME), a continuous relaxation for multi-agent reinforcement learning with submodular team utilities, and applies it to online distributed task allocation with theoretical guarantees and GNN-based policies.

Contribution

It proposes the PME relaxation and SubMAPG, a policy-gradient framework built on submodular difference rewards, with proven approximation guarantees and applicability to open multi-agent systems.

Findings

01 SubMAPG outperforms local greedy and shared-reward baselines in experiments.

02 Theoretical guarantees include a 1/2-approximation and sublinear regret.

03 GNN policies effectively handle open systems with dynamic agents and targets.

Abstract

This paper studies multi-agent reinforcement learning with submodular team utilities for online distributed task allocation. In this setting, each agent selects one action from a local categorical policy, so feasible joint actions form a partition matroid over agent-action pairs. Classical multilinear extensions use independent Bernoulli sampling and therefore do not match the categorical policies executed by decentralized agents. To address this mismatch, we introduce the Partition Multilinear Extension (PME), a continuous relaxation whose value equals the expected team utility under factorized categorical policies. We prove that submodular difference rewards provide unbiased PME marginal-gradient information and yield a stagewise score-function policy-gradient estimator. Based on this connection, we propose SubMAPG, a centralized-training decentralized-execution policy-gradient…
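
The mismatch described in the abstract can be illustrated on a toy instance. The sketch below is not the paper's implementation: the two-agent coverage problem, the action sets, and all function names are hypothetical, and the difference reward is assumed to be the common form f(a) - f(a without agent i). It evaluates the PME by exact enumeration (each agent samples exactly one action), evaluates the classical multilinear extension (each agent-action pair included independently), and checks that expected difference rewards equal the PME partial derivatives up to a baseline that does not depend on the agent's own action.

```python
from itertools import product

# Hypothetical toy instance: each agent picks exactly one set of targets to cover.
ACTIONS = [
    [frozenset({0, 1}), frozenset({1, 2})],  # agent 0's actions
    [frozenset({2, 3}), frozenset({3})],     # agent 1's actions
]

def coverage(chosen):
    """Submodular team utility: number of distinct targets covered."""
    covered = set()
    for targets in chosen:
        covered |= targets
    return len(covered)

def pme(policies):
    """Expected utility when agent i samples one action from policies[i]."""
    total = 0.0
    for joint in product(*(range(len(acts)) for acts in ACTIONS)):
        prob = 1.0
        for i, k in enumerate(joint):
            prob *= policies[i][k]
        total += prob * coverage([ACTIONS[i][k] for i, k in enumerate(joint)])
    return total

def bernoulli_extension(policies):
    """Classical multilinear extension: each agent-action pair is included
    independently with probability policies[i][k]."""
    pairs = [(i, k) for i, acts in enumerate(ACTIONS) for k in range(len(acts))]
    total = 0.0
    for mask in product([0, 1], repeat=len(pairs)):
        prob = 1.0
        chosen = []
        for bit, (i, k) in zip(mask, pairs):
            p = policies[i][k]
            prob *= p if bit else 1.0 - p
            if bit:
                chosen.append(ACTIONS[i][k])
        total += prob * coverage(chosen)
    return total

def pme_partial(policies, agent, action):
    """Partial derivative of the PME w.r.t. policies[agent][action].
    The PME is linear in each agent's policy, so this is the expected
    utility with that agent's action pinned."""
    pinned = [list(p) for p in policies]
    pinned[agent] = [1.0 if k == action else 0.0
                     for k in range(len(ACTIONS[agent]))]
    return pme(pinned)

def expected_difference_reward(policies, agent, action):
    """E[f(a) - f(a without agent)] with the agent's action pinned
    (assumed definition of the difference reward)."""
    others = [i for i in range(len(ACTIONS)) if i != agent]
    total = 0.0
    for joint in product(*(range(len(ACTIONS[i])) for i in others)):
        prob = 1.0
        rest = []
        for i, k in zip(others, joint):
            prob *= policies[i][k]
            rest.append(ACTIONS[i][k])
        gain = coverage(rest + [ACTIONS[agent][action]]) - coverage(rest)
        total += prob * gain
    return total

policies = [[0.5, 0.5], [0.5, 0.5]]
pme_value = pme(policies)                  # categorical sampling
mle_value = bernoulli_extension(policies)  # independent Bernoulli sampling

# Gaps between two actions of agent 0 agree for both quantities,
# because the subtracted baseline does not depend on agent 0's action.
grad_gap = pme_partial(policies, 0, 0) - pme_partial(policies, 0, 1)
diff_gap = (expected_difference_reward(policies, 0, 0)
            - expected_difference_reward(policies, 0, 1))
```

On this instance with uniform policies the PME evaluates to 3.25 while the Bernoulli extension gives 2.75: agent 0 always covers target 1 under categorical sampling, whereas independent inclusion can select neither or both of its actions. Exact enumeration is only feasible for tiny instances; a sampled estimator would be needed at scale.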
