MOMA-AC: A preference-driven actor-critic framework for continuous multi-objective multi-agent reinforcement learning
Adam Callaghan, Karl Mason, Patrick Mannion

TL;DR
This paper introduces MOMA-AC, an actor-critic framework for continuous multi-objective multi-agent reinforcement learning (MOMARL) that encodes Pareto-optimal policies across conflicting objectives in a single network, with improved performance and scalability.
Contribution
The paper presents the first dedicated inner-loop actor-critic framework for continuous MOMARL, combining a multi-headed actor, a centralised critic, and objective preference conditioning.
Findings
Achieves significant improvements in expected utility and hypervolume.
Scales stably as the number of agents increases.
Provides a new test suite for continuous MOMARL evaluation.
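The two metrics named in the findings above, expected utility and hypervolume, are standard multi-objective evaluation measures. The sketch below illustrates them for the two-objective maximisation case only. It assumes linear utility functions and a hand-picked reference point; the function names, sample returns, and preference weights are hypothetical and are not taken from the paper, whose exact evaluation settings are not given in this summary.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (return on objective 1, return on objective 2)


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the non-dominated points, assuming both objectives are maximised."""
    unique = set(points)
    front = [
        p for p in unique
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in unique)
    ]
    return sorted(front)


def hypervolume_2d(points: Sequence[Point], reference: Point) -> float:
    """Area dominated by the front and bounded below by the reference point."""
    # Sweep from the largest first objective downwards; along a Pareto front
    # the second objective then increases, so each point adds one new strip.
    front = sorted(pareto_front(points), key=lambda p: p[0], reverse=True)
    area = 0.0
    covered_y = reference[1]
    for x, y in front:
        if x > reference[0] and y > covered_y:
            area += (x - reference[0]) * (y - covered_y)
            covered_y = y
    return area


def expected_utility(points: Sequence[Point], weights: Sequence[Point]) -> float:
    """Mean, over preference weights, of the best linear utility any point achieves."""
    total = 0.0
    for w in weights:
        total += max(w[0] * p[0] + w[1] * p[1] for p in points)
    return total / len(weights)


# Hypothetical vector returns of four policies and three preference vectors.
returns = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0)]
preferences = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]

hv = hypervolume_2d(returns, reference=(0.0, 0.0))
eu = expected_utility(returns, preferences)
```

Hypervolume rewards both the quality and the spread of the front, while expected utility measures how well the best available policy serves a user whose preference is drawn from the given set of weights.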
Abstract
This paper addresses a critical gap in Multi-Objective Multi-Agent Reinforcement Learning (MOMARL) by introducing the first dedicated inner-loop actor-critic framework for continuous state and action spaces: Multi-Objective Multi-Agent Actor-Critic (MOMA-AC). Building on single-objective, single-agent algorithms, we instantiate this framework with Twin Delayed Deep Deterministic Policy Gradient (TD3) and Deep Deterministic Policy Gradient (DDPG), yielding MOMA-TD3 and MOMA-DDPG. The framework combines a multi-headed actor network, a centralised critic, and an objective preference-conditioning architecture, enabling a single neural network to encode the Pareto front of optimal trade-off policies for all agents across conflicting objectives in a continuous MOMARL setting. We also outline a natural test suite for continuous MOMARL by combining a pre-existing multi-agent single-objective…
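The three components named in the abstract (a multi-headed actor, a centralised critic, and preference conditioning) can be sketched as a forward pass. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes one actor head per agent on a shared trunk, a preference weight vector appended to every network input, and a critic that outputs one value per objective. All class names, layer sizes, and inputs are hypothetical, and training (including the TD3 twin critics and delayed updates) is omitted.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Layer = Tuple[List[Vector], Vector]  # (weight rows, biases)


def init_layer(rng: random.Random, n_in: int, n_out: int) -> Layer:
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x: Vector, layer: Layer, squash: bool = True) -> Vector:
    """Fully connected layer; tanh output when squash is True, linear otherwise."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [math.tanh(o) for o in out] if squash else out


class MultiHeadedActor:
    """Shared trunk plus one output head per agent (assumed layout)."""

    def __init__(self, obs_dim, n_objectives, n_agents, action_dim, hidden=8, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(rng, obs_dim + n_objectives, hidden)
        self.heads = [init_layer(rng, hidden, action_dim) for _ in range(n_agents)]

    def act(self, observations: List[Vector], preference: Vector) -> List[Vector]:
        actions = []
        for obs, head in zip(observations, self.heads):
            # Preference conditioning: the weight vector is part of the input.
            features = dense(obs + preference, self.trunk)
            actions.append(dense(features, head))  # tanh keeps actions in [-1, 1]
        return actions


class CentralisedCritic:
    """Sees all observations and actions; returns one Q-value per objective."""

    def __init__(self, obs_dim, n_objectives, n_agents, action_dim, hidden=8, seed=1):
        rng = random.Random(seed)
        n_in = n_agents * (obs_dim + action_dim) + n_objectives
        self.hidden = init_layer(rng, n_in, hidden)
        self.out = init_layer(rng, hidden, n_objectives)

    def q_values(self, observations, actions, preference) -> Vector:
        joint = [v for o in observations for v in o]
        joint += [v for a in actions for v in a]
        joint += preference
        return dense(dense(joint, self.hidden), self.out, squash=False)


def scalarise(q: Vector, preference: Vector) -> float:
    """Linear scalarisation of a vector-valued estimate (an assumed utility)."""
    return sum(w * v for w, v in zip(preference, q))


actor = MultiHeadedActor(obs_dim=3, n_objectives=2, n_agents=2, action_dim=1)
critic = CentralisedCritic(obs_dim=3, n_objectives=2, n_agents=2, action_dim=1)
observations = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.4]]
pref_a, pref_b = [0.9, 0.1], [0.1, 0.9]

actions_a = actor.act(observations, pref_a)
actions_b = actor.act(observations, pref_b)
q_a = critic.q_values(observations, actions_a, pref_a)
utility_a = scalarise(q_a, pref_a)
```

Because the preference vector is an input rather than a fixed training setting, the same weights yield different joint actions for `pref_a` and `pref_b`, which is how a single network can represent a range of trade-off policies.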
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Adversarial Robustness in Machine Learning
