RLCFR: Minimize Counterfactual Regret by Deep Reinforcement Learning
Huale Li, Xuan Wang, Fengwei Jia, Yifan Li, Yulin Wu, Jiajia Zhang, Shuhan Qi

TL;DR
This paper introduces RLCFR, a reinforcement learning framework that enhances the generalization of counterfactual regret minimization in two-player zero-sum imperfect information games by learning adaptive regret updating policies.
Contribution
RLCFR models the CFR iterative process as an MDP and learns a policy for regret updating, improving generalization over existing methods.
Findings
Significantly improved generalization ability on various games
Outperforms state-of-the-art CFR methods in experiments
Effective learning of regret update policies through reinforcement learning
Abstract
Counterfactual regret minimization (CFR) is a popular method for solving two-player zero-sum games with imperfect information. Unlike existing studies, which mostly focus on solving larger-scale problems or on speeding up the solution process, we propose a framework, RLCFR, that aims to improve the generalization ability of the CFR method. In RLCFR, the game strategy is solved by CFR within a reinforcement learning framework, and the dynamic procedure of iterative, interactive strategy updating is modeled as a Markov decision process (MDP). RLCFR then learns a policy that selects the appropriate regret-updating method at each iteration. In addition, a stepwise reward function, proportional to how well the iteration strategy performs at each step, is formulated to learn the action policy. Extensive experimental results on various…
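To make the MDP framing in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a tiny matrix game (rock-paper-scissors) instead of an extensive-form game, an action set of just two regret-update rules (vanilla accumulation and CFR+-style clipping), and a stepwise reward equal to the negative exploitability of the average strategy. The names `RegretUpdateEnv` and `run_episode`, the state layout, and the non-zero initial regrets are all hypothetical choices made for illustration.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors (the column player receives the negative).
PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]], dtype=float)


def regret_matching(regrets):
    """Play in proportion to positive regret; fall back to uniform if none is positive."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(regrets), 1.0 / len(regrets))


def exploitability(avg_row, avg_col):
    """Sum of both players' best-response values against the average strategies."""
    best_row = (PAYOFF @ avg_col).max()
    best_col = (-(avg_row @ PAYOFF)).max()
    return best_row + best_col


class RegretUpdateEnv:
    """Hypothetical MDP wrapper: each step is one regret-minimization iteration."""

    ACTIONS = ("vanilla", "plus")  # assumed action set, not taken from the paper

    def reset(self):
        # Assumption: asymmetric starting regrets so that play does not begin at equilibrium.
        self.regrets = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
        self.strategy_sums = [np.zeros(3), np.zeros(3)]
        self.t = 0
        return np.array([0.0, 0.0])

    def step(self, action):
        rule = self.ACTIONS[action]
        row = regret_matching(self.regrets[0])
        col = regret_matching(self.regrets[1])
        # Expected utility of each pure action against the opponent's current strategy.
        utilities = (PAYOFF @ col, -(row @ PAYOFF))
        for p, (strategy, utility) in enumerate(zip((row, col), utilities)):
            instant_regret = utility - strategy @ utility
            self.regrets[p] = self.regrets[p] + instant_regret
            if rule == "plus":
                # CFR+-style update: clip cumulative regret at zero.
                self.regrets[p] = np.maximum(self.regrets[p], 0.0)
            self.strategy_sums[p] += strategy
        self.t += 1
        averages = [s / self.t for s in self.strategy_sums]
        expl = exploitability(averages[0], averages[1])
        state = np.array([float(self.t), expl])
        reward = -expl  # assumed stepwise reward: better average strategy, higher reward
        return state, reward


def run_episode(policy, steps):
    """Roll out `policy` (state -> action index) and collect the stepwise rewards."""
    env = RegretUpdateEnv()
    state = env.reset()
    rewards = []
    for _ in range(steps):
        state, reward = env.step(policy(state))
        rewards.append(reward)
    return rewards


if __name__ == "__main__":
    # Placeholder policy that alternates between the two update rules.
    rewards = run_episode(lambda s: int(s[0]) % 2, 2000)
    print(rewards[0], rewards[-1])
```

In this sketch the policy is a fixed placeholder. In an RL setup of the kind the abstract describes, it would instead be trained on the `(state, reward)` stream so that it learns which update rule to apply at each iteration.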
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
