Parallelizing Tree Search with Twice Sequential Monte Carlo
By: Yaniv Oren, Joery A. de Vries, Pascal R. van der Vaart, and others
Potential Business Impact:
Lets AI programs search deeper and learn faster by running well on parallel chips like GPUs.
Model-based reinforcement learning (RL) methods that leverage search are responsible for many milestone breakthroughs in RL. Sequential Monte Carlo (SMC) recently emerged as an alternative to Monte Carlo Tree Search (MCTS), the algorithm that drove these breakthroughs. SMC is easier to parallelize and better suited to GPU acceleration. However, it also suffers from large variance and path degeneracy, which prevent it from scaling well with increased search depth, i.e., increased sequential compute. To address these problems, we introduce Twice Sequential Monte Carlo Tree Search (TSMCTS). Across discrete and continuous environments, TSMCTS outperforms the SMC baseline as well as a popular modern version of MCTS. Through variance reduction and mitigation of path degeneracy, TSMCTS scales favorably with sequential compute while retaining the properties that make SMC natural to parallelize.
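To make the failure mode concrete, here is a minimal, illustrative sketch of plain SMC planning, not the paper's TSMCTS algorithm. It assumes a toy two-action problem with a hypothetical reward function; particles sample actions, are weighted by exponentiated rewards, and are resampled at every step. Repeated resampling duplicates high-weight paths and discards the rest, which is exactly the path degeneracy the abstract refers to; the effective sample size (ESS) is the standard diagnostic for it.

```python
import math
import random


def effective_sample_size(weights):
    # ESS = 1 / sum(w_i^2) for normalized weights.
    # ESS near n means diverse particles; ESS near 1 signals degeneracy.
    return 1.0 / sum(w * w for w in weights)


def smc_search(step_reward, n_particles=64, depth=10, seed=0):
    """Toy SMC planner (illustrative only, not the paper's method).

    Each particle is an action history. At every depth step, each particle
    samples a random action, gets a weight exp(reward), and the whole
    population is multinomially resampled by weight.
    """
    rng = random.Random(seed)
    particles = [[] for _ in range(n_particles)]  # action histories
    ess_history = []
    for _ in range(depth):
        weights = []
        for p in particles:
            a = rng.choice([0, 1])  # two-action toy environment
            p.append(a)
            weights.append(math.exp(step_reward(a)))
        total = sum(weights)
        weights = [w / total for w in weights]
        ess_history.append(effective_sample_size(weights))
        # Multinomial resampling: copies high-weight paths, drops others.
        # Doing this at every step collapses the shared early history.
        particles = [list(rng.choices(particles, weights=weights, k=1)[0])
                     for _ in range(n_particles)]
    return particles, ess_history


# Hypothetical reward: action 1 pays 1.0, action 0 pays nothing.
paths, ess = smc_search(lambda a: 1.0 if a == 1 else 0.0)
frac_good = sum(p.count(1) for p in paths) / (len(paths) * len(paths[0]))
```

With the exp-reward weighting, resampling steers most surviving paths toward the rewarded action, and the recorded ESS values show how much particle diversity each resampling step is trading away.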
Similar Papers
Trust-Region Twisted Policy Improvement
Machine Learning (CS)
Makes computer games smarter and faster to learn.
Array-Based Monte Carlo Tree Search
Artificial Intelligence
Makes computer games think and play faster.
Tail-Risk-Safe Monte Carlo Tree Search under PAC-Level Guarantees
Machine Learning (CS)
Makes smart programs avoid dangerous bad choices.