Metric-Gradient Projection for Stable Multi-Agent Policy Learning
Zuyuan Zhang, Sizhe Tang, Mahdi Imani, Tian Lan

TL;DR
This paper introduces HPML, a projection method for multi-agent learning that stabilizes training by projecting the joint update field onto its metric-gradient component, with the aim of improving convergence and stability.
Contribution
HPML provides a geometric projection approach in multi-agent reinforcement learning, offering a new way to stabilize and improve learning dynamics.
Findings
HPML yields more stable training in multi-agent systems.
Experiments show improved normalized returns with HPML as a plug-in layer.
The method admits a Lyapunov potential and provides equilibrium-gap bounds.
Abstract
General-sum multi-agent learning is often governed by a stacked update field in which each agent's policy update changes the optimization landscape faced by the others. This coupling can entangle an integrable component of collective improvement with cyclic interaction dynamics, leading to slow or unstable multi-agent learning. Existing approaches, such as regularization, credit assignment, and consensus methods, stabilize MARL through local or algorithmic modifications; HPML complements them by projecting the joint update field onto a metric-gradient component. We introduce HPML (Hodge-Projected Multi-agent Learning), which views the joint update field of a multi-agent system as an element of a space of vector fields and computes a Hodge-type projection onto the closest metric-gradient potential flow. HPML follows the projected…
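The split between an integrable component and cyclic interaction dynamics can be illustrated on a toy problem. The sketch below is not the paper's algorithm. It assumes a linear joint update field v(x) = A x, the Euclidean metric, and a hypothetical two-player matrix A chosen for illustration. Under those assumptions, the symmetric part of A is the gradient of a quadratic potential and the antisymmetric part is a pure rotation, so keeping only the symmetric part plays the role of the projection onto a gradient field.

```python
import numpy as np

def split_linear_field(A):
    """Split v(x) = A x into a gradient part S x and a cyclic part K x."""
    S = 0.5 * (A + A.T)  # symmetric: S x is the gradient of phi(x) = 0.5 * x^T S x
    K = 0.5 * (A - A.T)  # antisymmetric: x^T K x = 0, so K x only rotates x
    return S, K

def run(M, x0, step=0.1, iters=200):
    """Follow the field x -> M x with explicit Euler steps."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x + step * (M @ x)
    return x

# Hypothetical joint update field (assumption, not from the paper):
# weak collective improvement on the diagonal, strong cyclic coupling off it.
A = np.array([[-0.1,  2.0],
              [-2.0, -0.1]])
S, K = split_linear_field(A)

x0 = [1.0, 1.0]
raw_norm = np.linalg.norm(run(A, x0))   # follows the full (entangled) field
proj_norm = np.linalg.norm(run(S, x0))  # follows only the gradient component
print("raw field:      ", raw_norm)
print("projected field:", proj_norm)
```

In this toy setting, the iterate that follows the full field spirals away from the equilibrium at the origin at this step size. The iterate that follows the projected field contracts toward it, and the potential phi rises monotonically along the way. A nonlinear field or a non-Euclidean metric would need a genuine projection step, which this sketch does not attempt.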
