MOMA-AC: A preference-driven actor-critic framework for continuous multi-objective multi-agent reinforcement learning

Adam Callaghan; Karl Mason; Patrick Mannion

arXiv:2511.18181 · cs.LG · November 25, 2025

Open Access

TL;DR

This paper introduces MOMA-AC, an actor-critic framework for continuous multi-objective multi-agent reinforcement learning (MOMARL). A single network encodes Pareto-optimal trade-off policies across conflicting objectives, and the authors report improved performance and scalability.

Contribution

The paper presents the first dedicated inner-loop actor-critic framework for continuous MOMARL, integrating preference-conditioning and scalable multi-agent capabilities.

Findings

01 Achieves significant improvements in expected utility and hypervolume.

02 Scales stably as the number of agents increases.

03 Provides a new test suite for continuous MOMARL evaluation.
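
The first finding refers to two metrics, expected utility and hypervolume. The page does not give the paper's exact definitions, so the following is only a minimal sketch of how these metrics are commonly computed. It assumes two objectives that are both maximised, a linear utility function, and a reference point dominated by every solution. The function names, example points, and weight vectors are hypothetical.

```python
import numpy as np

def pareto_filter(points):
    """Keep non-dominated points, assuming every objective is maximised."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = any(
            np.all(q >= p) and np.any(q > p)
            for j, q in enumerate(pts) if j != i
        )
        if not dominated:
            keep.append(p)
    return np.array(keep)

def hypervolume_2d(points, ref):
    """Area dominated by the front and bounded below by the reference point.

    Assumes every point is at least as good as `ref` in both objectives.
    """
    front = pareto_filter(points)
    front = front[np.argsort(-front[:, 0])]  # descending in objective 0
    hv, prev_y = 0.0, ref[1]
    for x, y in front:
        # Each point adds a horizontal strip above the previous point's height.
        hv += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return hv

def expected_utility(points, weights):
    """Mean over weight vectors of the best linearly scalarised return."""
    pts = np.asarray(points, dtype=float)
    W = np.asarray(weights, dtype=float)
    return float(np.mean(np.max(W @ pts.T, axis=1)))

# Illustrative returns for four policies on two objectives (made-up numbers).
points = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
ref = (0.0, 0.0)
weights = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]

print(hypervolume_2d(points, ref))        # 6.0
print(expected_utility(points, weights))  # about 2.667
```

Hypervolume rewards a front that is both close to the ideal point and well spread, while expected utility measures how well the solution set serves a user whose preference weights are drawn from the given set.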

Abstract

This paper addresses a critical gap in Multi-Objective Multi-Agent Reinforcement Learning (MOMARL) by introducing the first dedicated inner-loop actor-critic framework for continuous state and action spaces: Multi-Objective Multi-Agent Actor-Critic (MOMA-AC). Building on single-objective, single-agent algorithms, we instantiate this framework with Twin Delayed Deep Deterministic Policy Gradient (TD3) and Deep Deterministic Policy Gradient (DDPG), yielding MOMA-TD3 and MOMA-DDPG. The framework combines a multi-headed actor network, a centralised critic, and an objective preference-conditioning architecture, enabling a single neural network to encode the Pareto front of optimal trade-off policies for all agents across conflicting objectives in a continuous MOMARL setting. We also outline a natural test suite for continuous MOMARL by combining a pre-existing multi-agent single-objective…
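
The three components named in the abstract (a multi-headed actor, a centralised critic, and preference conditioning) can be illustrated with a small forward-pass sketch. This is not the authors' implementation. It assumes a shared global observation, a preference weight vector `w` concatenated to the network inputs, one actor head per agent, a critic with one output per objective, and linear scalarisation of the critic output. All class names, layer sizes, and dimensions are hypothetical, and plain NumPy is used in place of a deep learning library.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_linear(n_in, n_out):
    """Small random weights and a zero bias for one dense layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

class MultiHeadActor:
    """Shared trunk plus one output head per agent (assumed layout)."""

    def __init__(self, obs_dim, n_objectives, n_agents, act_dim, hidden=32):
        self.trunk = init_linear(obs_dim + n_objectives, hidden)
        self.heads = [init_linear(hidden, act_dim) for _ in range(n_agents)]

    def act(self, obs, w):
        # Preference conditioning: the weight vector is appended to the input.
        x = np.concatenate([obs, w])
        W, b = self.trunk
        h = np.tanh(x @ W + b)
        # One deterministic action per agent, squashed into [-1, 1].
        return np.stack([np.tanh(h @ Wh + bh) for Wh, bh in self.heads])

class CentralisedCritic:
    """Maps (global obs, joint action, w) to one Q-value per objective."""

    def __init__(self, obs_dim, n_objectives, n_agents, act_dim, hidden=32):
        in_dim = obs_dim + n_agents * act_dim + n_objectives
        self.l1 = init_linear(in_dim, hidden)
        self.l2 = init_linear(hidden, n_objectives)

    def q(self, obs, joint_action, w):
        x = np.concatenate([obs, joint_action.ravel(), w])
        h = np.tanh(x @ self.l1[0] + self.l1[1])
        return h @ self.l2[0] + self.l2[1]

obs_dim, n_obj, n_agents, act_dim = 6, 2, 3, 2
actor = MultiHeadActor(obs_dim, n_obj, n_agents, act_dim)
critic = CentralisedCritic(obs_dim, n_obj, n_agents, act_dim)

obs = rng.normal(size=obs_dim)
w = np.array([0.7, 0.3])           # preference over the two objectives
actions = actor.act(obs, w)        # shape (n_agents, act_dim)
q_vec = critic.q(obs, actions, w)  # shape (n_objectives,)
scalar_q = float(w @ q_vec)        # assumed linear scalarisation
print(actions.shape, q_vec.shape, scalar_q)
```

Because `w` is an input rather than a fixed hyperparameter, sweeping it at evaluation time yields a different joint policy per preference from the same set of weights, which is how a single network can represent a set of trade-off policies.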

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Adversarial Robustness in Machine Learning