Successor Features for Transfer in Alternating Markov Games

Sunny Amatya, Yi Ren, Zhe Xu, and Wenlong Zhang

arXiv:2507.22278·cs.MA·July 31, 2025



TL;DR

This paper introduces a novel transfer learning algorithm for multi-agent Markov games using successor features, enabling effective policy transfer and improved performance in turn-based games.

Contribution

It proposes the Game Generalized Policy Improvement (GGPI) algorithm, which applies successor features to multi-agent games to transfer knowledge across game tasks.

Findings

01 GGPI achieves high-reward interactions in experiments.

02 It enables one-shot policy transfer.

03 It outperforms baseline algorithms in success rate and path efficiency.

Abstract

This paper explores successor features for knowledge transfer in zero-sum, complete-information, turn-based games. Prior research in single-agent systems has shown that successor features can provide a "jump start" for agents facing new tasks with varying reward structures. However, knowledge transfer in games typically relies on value and equilibrium transfers, which depend heavily on the similarity between tasks. This reliance can lead to failures when the tasks differ significantly. To address this issue, this paper applies successor features to games and presents a novel algorithm called Game Generalized Policy Improvement (GGPI), designed to address Markov games in multi-agent reinforcement learning. The proposed algorithm enables the transfer of learned values and policies across games. An upper bound of the errors for transfer is derived as a…
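As a rough illustration of the single-agent idea the abstract builds on, the sketch below evaluates stored successor features against a new reward weight vector and picks an action by generalized policy improvement. It is not the paper's GGPI algorithm: it has no opponent or turn structure, and the names `q_values`, `gpi_action`, `psi_per_policy`, and `w_new`, along with all the numbers, are hypothetical and chosen only for illustration. It assumes rewards are linear in the features, so that Q(s, a) is the dot product of the successor features psi(s, a) and the task weights w.

```python
def q_values(psi, w):
    """Return Q(s, a) = psi(s, a) . w for every action at one state.

    psi is a list of per-action successor-feature vectors (hypothetical data),
    w is the reward weight vector of the task being evaluated.
    """
    return [sum(p * wi for p, wi in zip(psi_a, w)) for psi_a in psi]


def gpi_action(psi_per_policy, w):
    """Generalized policy improvement at a single state.

    For each action, take the best Q-value over all stored policies,
    then return the action with the highest such value.
    """
    n_actions = len(psi_per_policy[0])
    q_per_policy = [q_values(psi, w) for psi in psi_per_policy]
    best = [max(q[a] for q in q_per_policy) for a in range(n_actions)]
    action = max(range(n_actions), key=lambda a: best[a])
    return action, best


# Hypothetical successor features at one state:
# 2 previously learned policies, 2 actions, 2 features.
psi_per_policy = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.2, 2.0]],
]

# Hypothetical weight vector for a new task with a different reward structure.
w_new = [1.0, -1.0]

action, best = gpi_action(psi_per_policy, w_new)
print(action, best)
```

Because only `w_new` changes between tasks, the stored successor features can be reused without relearning, which is the kind of "jump start" the abstract refers to.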
