Multi-agent imitation learning with function approximation: Linear Markov games and beyond

Luca Viano; Till Freihaut; Emanuele Nevali; Volkan Cevher; Matthieu Geist; Giorgia Ramponi

arXiv:2602.22810·cs.LG·February 27, 2026

TL;DR

This paper provides the first theoretical analysis of multi-agent imitation learning (MAIL) in linear Markov games. It introduces feature-based concentrability coefficients, a computationally efficient interactive algorithm, and a deep learning approach that outperforms behavioral cloning in Tic-Tac-Toe and Connect4.

Contribution

The paper introduces the first theoretical framework for multi-agent imitation learning (MAIL) in linear Markov games, including a feature-based concentrability measure and a computationally efficient interactive algorithm.

Findings

01. Feature-based concentrability coefficients can be smaller than state-action ones.

02. The proposed interactive MAIL algorithm has sample complexity depending only on the feature dimension.

03. Deep MAIL outperforms behavioral cloning in Tic-Tac-Toe and Connect4.

Abstract

In this work, we present the first theoretical analysis of multi-agent imitation learning (MAIL) in linear Markov games, where both the transition dynamics and each agent's reward function are linear in some given features. We demonstrate that, by leveraging this structure, it is possible to replace the state-action level "all policy deviation concentrability coefficient" (Freihaut et al., arXiv:2510.09325) with a concentrability coefficient defined at the feature level, which can be much smaller than its state-action analog when the features are informative about the similarity between states. Furthermore, to circumvent the need for any concentrability coefficient, we turn to the interactive setting. We provide the first computationally efficient interactive MAIL algorithm for linear Markov games and show that its sample complexity depends only on the dimension $d$ of the feature map. Building on…
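The linear structure described in the abstract can be illustrated with a small numerical sketch. The construction below follows the usual linear-MDP-style factorization, in which transition probabilities and rewards are inner products with a shared feature vector; that the paper uses exactly this form is an assumption. The names `phi`, `mu`, and `theta`, the two-agent setup, the problem sizes, and the choice of simplex-valued features (one simple way to guarantee valid transition probabilities) are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 states, 2 actions per agent, feature dimension d = 3.
n_states, n_actions_1, n_actions_2, d = 4, 2, 2, 3
n_agents = 2

# Assumed feature map phi(s, a1, a2): one d-dimensional vector per state and
# joint action, drawn from the probability simplex.
phi = rng.dirichlet(np.ones(d), size=(n_states, n_actions_1, n_actions_2))

# Assumed transition factors: mu[k] is a distribution over next states.
mu = rng.dirichlet(np.ones(n_states), size=d)            # shape (d, S)

# Assumed reward weights: one d-dimensional vector per agent, entries in [0, 1].
theta = rng.uniform(0.0, 1.0, size=(n_agents, d))        # shape (2, d)

# Transitions linear in the features: P(s' | s, a1, a2) = <phi(s, a1, a2), mu(s')>.
P = phi @ mu                                             # shape (S, A1, A2, S)

# Rewards linear in the features: r_i(s, a1, a2) = <phi(s, a1, a2), theta_i>.
R = np.einsum("sabd,id->isab", phi, theta)               # shape (2, S, A1, A2)

# Each P[s, a1, a2] is a convex combination of the rows of mu, hence a
# valid probability distribution over next states.
rows_sum_to_one = np.allclose(P.sum(axis=-1), 1.0)
print("transition rows sum to 1:", rows_sum_to_one)
print("reward range:", float(R.min()), float(R.max()))
```

In this sketch the whole game is specified by `d`-dimensional objects (`phi`, `mu`, `theta`) rather than by a separate table entry for every state and joint action, which gives some intuition for why guarantees in this setting can be stated in terms of $d$.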

Taxonomy

Topics: Reinforcement Learning in Robotics · Game Theory and Applications · Adaptive Dynamic Programming Control