Combining Counterfactual Regret Minimization with Information Gain to Solve Extensive Games with Unknown Environments
Chen Qiu, Xuan Wang, Tianzi Ma, Yaojun Wen, Jiajia Zhang

TL;DR
This paper introduces a method that combines counterfactual regret minimization (CFR) with information gain to solve extensive games in unknown environments, improving exploration efficiency and the accuracy of the Nash equilibrium (NE) approximation.
Contribution
It proposes a curiosity-driven approach that integrates information gain into CFR, enabling effective NE computation in uncertain environments with fewer interactions.
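To make the idea of an information-gain signal concrete, here is a minimal sketch. It is not the paper's formulation, which the truncated abstract on this page does not spell out. It assumes a hypothetical count-based model of the chance player's transitions with a symmetric Dirichlet prior, and it scores an observation by how far it moves the model's predictive distribution, measured by KL divergence. The names `predictive`, `kl_divergence`, and `information_gain` are illustrative only.

```python
import math


def predictive(counts, prior=1.0):
    """Predictive distribution of a count-based model with a symmetric Dirichlet prior."""
    total = sum(counts) + prior * len(counts)
    return [(c + prior) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same outcomes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def information_gain(counts, observed, prior=1.0):
    """Illustrative curiosity signal (an assumption, not the paper's definition):
    KL divergence between the predictive distribution after and before
    observing outcome index `observed`."""
    before = predictive(counts, prior)
    updated = list(counts)
    updated[observed] += 1
    after = predictive(updated, prior)
    return kl_divergence(after, before)


# A rarely visited chance node yields a larger signal than a well-explored one.
gain_unexplored = information_gain([0, 0], observed=0)
gain_explored = information_gain([50, 50], observed=0)
```

Under these assumptions the signal shrinks as a transition is sampled more often, which is the property a curiosity-driven explorer relies on to direct interactions toward poorly understood parts of the game.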
Findings
Reduces environment interactions compared to baselines.
Achieves more accurate NE approximation.
Demonstrates effectiveness on Kuhn and Leduc poker.
Abstract
Counterfactual regret minimization (CFR) is an effective algorithm for solving imperfect-information extensive games (IIEGs). However, CFR can only be applied in known environments, where the transition function of the chance player and the reward function at terminal nodes are known. In uncertain settings, such as reinforcement learning (RL) problems, CFR is not applicable. Applying CFR in unknown environments is therefore a significant challenge, and solving it would also address some real-world difficulties. Current state-of-the-art solutions require more interactions with the environment and are limited by the large variance of single sampling when narrowing the gap with the real environment. In this paper, we propose a method that combines CFR with information gain to compute the Nash equilibrium (NE) of IIEGs with unknown environments. We use a curiosity-driven approach to…
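For readers unfamiliar with CFR, its strategy update at each information set is commonly done with regret matching: actions are played in proportion to their positive cumulative regret. The sketch below shows only that standard building block, not the authors' full method. The function name `regret_matching` and the example regret values are illustrative.

```python
def regret_matching(cumulative_regrets):
    """Return a strategy proportional to positive cumulative regret."""
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Two actions share all positive regret; the negative-regret action is dropped.
strategy = regret_matching([2.0, -1.0, 2.0])
```

In a known environment the regrets are computed from exact chance probabilities and terminal rewards. In the unknown setting the abstract describes, those quantities must first be estimated from interaction with the environment.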
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
