Multi-Agent Learning in Contextual Games under Unknown Constraints

Anna M. Maddux; Maryam Kamgarpour

arXiv:2310.14685 · cs.GT · May 27, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a learning algorithm for repeated multi-agent contextual games with unknown rewards and unknown constraints. The algorithm attains low regret while the time-averaged constraint violations converge to zero, with theoretical guarantees and experimental validation.

Contribution

It develops c.z.AdaNormalGP, a no-regret, no-violation algorithm for contextual games with unknown constraints, and derives kernel-dependent regret bounds along with related equilibrium concepts.

Findings

01. The algorithm achieves sublinear regret bounds.

02. The time-averaged constraint violations converge to zero as the game is repeated.

03. The algorithm is effective in multi-agent reinforcement learning experiments.

Abstract

We consider the problem of learning to play a repeated contextual game with unknown reward and unknown constraint functions. Such games arise in applications where each agent's action needs to belong to a feasible set, but the feasible set is a priori unknown. For example, in constrained multi-agent reinforcement learning, the constraints on the agents' policies are a function of the unknown dynamics and hence are themselves unknown. Under kernel-based regularity assumptions on the unknown functions, we develop a no-regret, no-violation approach that exploits similarities among different reward and constraint outcomes. The no-violation property ensures that the time-averaged sum of constraint violations converges to zero as the game is repeated. We show that our algorithm, referred to as c.z.AdaNormalGP, obtains kernel-dependent regret bounds and that the cumulative constraint…
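The no-violation property described in the abstract can be illustrated with a small numerical sketch. This is not the paper's algorithm: the function name `average_violation`, the sign convention (a round is feasible when its constraint value is at most zero), and the synthetic trace are all assumptions made for illustration only.

```python
def average_violation(constraint_values):
    """Return the running time-average of constraint violations.

    Assumed convention (hypothetical, not taken from the paper): a round
    is feasible when its constraint value is <= 0, so the violation
    incurred in that round is max(0, value).
    """
    averages = []
    total = 0.0
    for t, value in enumerate(constraint_values, start=1):
        total += max(0.0, value)      # only positive values count as violations
        averages.append(total / t)    # time-averaged sum after t rounds
    return averages


# Hypothetical trace in which the per-round violation shrinks like 1/t.
trace = [1.0 / t for t in range(1, 1001)]
avg = average_violation(trace)
```

In this synthetic trace the cumulative violation keeps growing, but only sublinearly in the number of rounds, so the time-averaged value in `avg` shrinks toward zero. That is the sense in which a strategy is "no-violation" even if individual rounds are infeasible.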

Taxonomy

Topics: Advanced Bandit Algorithms Research · Decision-Making and Behavioral Economics