Accelerating Nash Equilibrium Convergence in Monte Carlo Settings Through Counterfactual Value Based Fictitious Play
Ju Qi, Falin Hei, Ting Feng, Dengbing Yi, Zhemei Fang, Yunfeng Luo

TL;DR
This paper introduces MCCFVFP, a Monte Carlo algorithm for large-scale imperfect information games that combines CFR's counterfactual value calculations with fictitious play's best response strategy, converging approximately 20-50% faster than existing MCCFR variants.
Contribution
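The paper presents MCCFVFP (Monte Carlo Counterfactual Value-Based Fictitious Play), a Monte Carlo algorithm that pairs CFR's counterfactual value calculations with fictitious play's best response strategy to speed up convergence in extensive-form imperfect information games.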
Findings
Achieved 20-50% faster convergence than existing MCCFR variants.
Effective in large-scale games like poker with many dominated strategies.
Demonstrated significant improvements through experiments on test games.
Abstract
Counterfactual Regret Minimization (CFR) and its variants are widely recognized as effective algorithms for solving extensive-form imperfect information games. Recently, many improvements have been focused on enhancing the convergence speed of the CFR algorithm. However, most of these variants are not applicable under Monte Carlo (MC) conditions, making them unsuitable for training in large-scale games. We introduce a new MC-based algorithm for solving extensive-form imperfect information games, called MCCFVFP (Monte Carlo Counterfactual Value-Based Fictitious Play). MCCFVFP combines CFR's counterfactual value calculations with fictitious play's best response strategy, leveraging the strengths of fictitious play to gain significant advantages in games with a high proportion of dominated strategies. Experimental results show that MCCFVFP achieved convergence speeds approximately…
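The contrast the abstract draws between CFR-style updates and fictitious play's best response can be illustrated on a toy example. The sketch below is not MCCFVFP and is not taken from the paper: it is a minimal pure-Python comparison, under stated assumptions, of regret matching (the update rule used inside CFR) against best response to the opponent's average strategy (fictitious play) on a small zero-sum matrix game. The payoff matrix `A` (rock-paper-scissors plus one strictly dominated row action) and all function names are hypothetical. In a one-shot matrix game, an action's counterfactual value reduces to its expected payoff against the opponent's strategy, which is what `row_values` and `col_values` compute.

```python
# Hypothetical toy game: row player's payoffs in a zero-sum game.
# Rows 0-2 are rock-paper-scissors; row 3 is strictly dominated.
A = [
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
    [-2.0, -2.0, -2.0],
]


def row_values(A, y):
    """Expected payoff of each row action against column strategy y."""
    return [sum(a * q for a, q in zip(row, y)) for row in A]


def col_values(A, x):
    """Expected payoff of each column action against row strategy x."""
    return [-sum(x[i] * A[i][j] for i in range(len(A)))
            for j in range(len(A[0]))]


def regret_matching(cum_regret):
    """Play in proportion to positive cumulative regret (uniform if none)."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / len(cum_regret)] * len(cum_regret)


def run_regret_matching(A, iters):
    """Self-play with regret matching; returns the average strategies."""
    m, n = len(A), len(A[0])
    rx, ry = [0.0] * m, [0.0] * n   # cumulative regrets
    sx, sy = [0.0] * m, [0.0] * n   # summed strategies
    for _ in range(iters):
        x, y = regret_matching(rx), regret_matching(ry)
        vx, vy = row_values(A, y), col_values(A, x)
        ex = sum(p * v for p, v in zip(x, vx))
        ey = sum(p * v for p, v in zip(y, vy))
        rx = [r + v - ex for r, v in zip(rx, vx)]
        ry = [r + v - ey for r, v in zip(ry, vy)]
        sx = [s + p for s, p in zip(sx, x)]
        sy = [s + p for s, p in zip(sy, y)]
    return [s / iters for s in sx], [s / iters for s in sy]


def argmax(values):
    """Index of the first maximal entry."""
    return max(range(len(values)), key=values.__getitem__)


def run_fictitious_play(A, iters):
    """Self-play where each player best-responds to the opponent's average."""
    m, n = len(A), len(A[0])
    cx, cy = [0.0] * m, [0.0] * n   # action counts
    cx[0] = cy[0] = 1.0             # arbitrary initial pure actions
    for _ in range(iters):
        x_avg = [c / sum(cx) for c in cx]
        y_avg = [c / sum(cy) for c in cy]
        bx = argmax(row_values(A, y_avg))
        by = argmax(col_values(A, x_avg))
        cx[bx] += 1.0
        cy[by] += 1.0
    return [c / sum(cx) for c in cx], [c / sum(cy) for c in cy]


def exploitability(A, x, y):
    """Sum of both players' best-response gains against (x, y)."""
    return max(row_values(A, y)) + max(col_values(A, x))


if __name__ == "__main__":
    for name, solver in [("regret matching", run_regret_matching),
                         ("fictitious play", run_fictitious_play)]:
        x, y = solver(A, 2000)
        print(name, "dominated-row weight:", x[3],
              "exploitability:", exploitability(A, x, y))
```

In this toy game the best-response player never selects the dominated row, because a dominated action is never a best response, while regret matching starts from a uniform strategy and so leaves a small residual weight on it in the average. This only hints at why best-response updates may help when many strategies are dominated; it says nothing about the convergence speeds reported for MCCFVFP.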
Taxonomy
Topics: Artificial Intelligence in Games · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification
