Deep (Predictive) Discounted Counterfactual Regret Minimization
By: Hang Xu, Kai Li, Haobo Fu and more
Potential Business Impact:
Teaches computers to play complex games better.
Counterfactual regret minimization (CFR) is a family of algorithms for effectively solving imperfect-information games. To enhance CFR's applicability in large games, researchers use neural networks to approximate its behavior. However, existing methods are mainly based on vanilla CFR and struggle to effectively integrate more advanced CFR variants. In this work, we propose an efficient model-free neural CFR algorithm that overcomes the limitations of existing methods in approximating advanced CFR variants. At each iteration, it collects variance-reduced sampled advantages based on a value network, fits cumulative advantages by bootstrapping, and applies discounting and clipping operations to simulate the update mechanisms of advanced CFR variants. Experimental results show that, compared with other model-free neural algorithms, it converges faster in typical imperfect-information games and demonstrates stronger adversarial performance in a large poker game.
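The discounting and clipping operations mentioned in the abstract can be illustrated in a tabular setting. The following is a minimal sketch, not the paper's neural algorithm: it assumes one common formulation of discounted CFR (positive cumulative regrets scaled by t^alpha / (t^alpha + 1), negative ones by t^beta / (t^beta + 1)) and CFR+-style clipping at zero. All function names, and the default values alpha=1.5 and beta=0.0, are illustrative assumptions.

```python
def regret_matching(cum_regret):
    """Turn cumulative regrets into a strategy (probabilities over actions)."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    return [1.0 / len(cum_regret)] * len(cum_regret)


def discounted_update(cum_regret, instant_regret, t, alpha=1.5, beta=0.0):
    """Discounted-CFR-style update (assumed formulation, tabular).

    Old positive and negative regrets are shrunk by different weights
    before the regret from iteration t is added.
    """
    pos_w = t ** alpha / (t ** alpha + 1.0)
    neg_w = t ** beta / (t ** beta + 1.0)
    return [
        (r * pos_w if r > 0 else r * neg_w) + inst
        for r, inst in zip(cum_regret, instant_regret)
    ]


def clipped_update(cum_regret, instant_regret):
    """CFR+-style update: add the new regret, then clip at zero."""
    return [max(r + inst, 0.0) for r, inst in zip(cum_regret, instant_regret)]
```

In this sketch, either update function would be called once per iteration for each information set, followed by regret_matching to obtain the next strategy. According to the abstract, the neural algorithm applies analogous operations to cumulative advantages fitted by bootstrapping rather than to a regret table.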
Similar Papers
Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play
Artificial Intelligence
Makes AI better at complex games by adapting strategies.
Asynchronous Predictive Counterfactual Regret Minimization$^+$ Algorithm in Solving Extensive-Form Games
Machine Learning (CS)
Makes game AI smarter and more reliable.
SpinGPT: A Large-Language-Model Approach to Playing Poker Correctly
Machine Learning (CS)
AI learns to play three-player poker better.