Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization

Junyi Liao; Zihan Zhu; Ethan Fang; Zhuoran Yang; Vahid Tarokh

arXiv:2601.12707 · cs.LG · May 20, 2026

TL;DR

This paper introduces a unified framework for recovering reward functions in two-player zero-sum matrix games and Markov games with entropy regularization, combining identifiability guarantees with an algorithm that learns rewards from observed actions.

Contribution

It establishes reward identifiability via QRE, proposes a novel algorithm for reward learning from observed actions, and provides theoretical and empirical validation.

Findings

01 Under linear assumptions, reward functions are identifiable from the quantal response equilibrium (QRE).

02 The proposed algorithm applies to both static (matrix game) and dynamic (Markov game) settings.

03 Numerical studies demonstrate the framework's effectiveness.

Abstract

Estimating the unknown reward functions driving agents' behaviors is of central interest in inverse reinforcement learning and game theory. To tackle this problem, we develop a unified framework for reward function recovery in two-player zero-sum matrix games and Markov games with entropy regularization, where we aim to reconstruct the underlying reward functions given observed players' strategies and actions. This task is challenging due to the inherent ambiguity of inverse problems, the non-uniqueness of feasible rewards, and limited observational data coverage. To address these challenges, we establish the reward function's identifiability using the quantal response equilibrium (QRE) under linear assumptions. Building upon this theoretical foundation, we propose a novel algorithm to learn reward functions from observed actions. Our algorithm works in both static and dynamic settings…
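To make the role of the QRE concrete, here is a minimal sketch that is not the paper's algorithm. It assumes the common entropy-regularized form of a zero-sum matrix game, in which the row player maximizes and the column player minimizes the payoff `Q`, so that the QRE satisfies mu ∝ exp(eta · Q nu) and nu ∝ exp(−eta · Qᵀ mu). The names `solve_qre`, `log_ratio_residual`, the parameter `eta`, the damping step, and the example matrix are all illustrative assumptions. The sketch computes a QRE by damped fixed-point iteration and then checks that the equilibrium strategies impose linear constraints on `Q`, which is the kind of structure an identifiability argument can build on.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def solve_qre(Q, eta=0.5, iters=5000, step=0.1):
    """Damped fixed-point iteration for the QRE of a zero-sum matrix game.

    Assumed convention (not taken from the paper): the row player
    maximizes mu^T Q nu, the column player minimizes it, and both are
    entropy-regularized with temperature 1/eta.
    """
    m, n = Q.shape
    mu = np.full(m, 1.0 / m)
    nu = np.full(n, 1.0 / n)
    for _ in range(iters):
        mu_br = softmax(eta * (Q @ nu))     # row player's quantal response
        nu_br = softmax(-eta * (Q.T @ mu))  # column player's quantal response
        mu = (1 - step) * mu + step * mu_br
        nu = (1 - step) * nu + step * nu_br
    return mu, nu

def log_ratio_residual(Q, mu, nu, eta):
    """Largest violation of the linear constraints the QRE places on Q.

    At a QRE, log(mu[a] / mu[0]) = eta * ((Q nu)[a] - (Q nu)[0]) for every
    row action a, and symmetrically (with a sign flip) for the column player.
    """
    u = Q @ nu
    v = Q.T @ mu
    row = np.log(mu / mu[0]) - eta * (u - u[0])
    col = np.log(nu / nu[0]) + eta * (v - v[0])
    return max(np.abs(row).max(), np.abs(col).max())

# Hypothetical 3x3 payoff matrix, chosen only for illustration.
Q = np.array([[1.0, -1.0, 0.0],
              [-1.0, 1.0, 0.5],
              [0.0, 0.5, -1.0]])
eta = 0.5
mu, nu = solve_qre(Q, eta=eta)
residual = log_ratio_residual(Q, mu, nu, eta)
```

The constraints are linear in `Q` once `mu`, `nu`, and `eta` are known, but they pin down only differences of `Q nu` and `Qᵀ mu`, not `Q` itself. This matches the non-uniqueness of feasible rewards noted in the abstract and suggests why additional structure, such as the linear assumptions it mentions, is needed for identifiability.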

Videos

Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques