Gaussian Process Aggregation for Root-Parallel Monte Carlo Tree Search with Continuous Actions
By: Junlin Xiao, Victor-Alexandru Darvariu, Bruno Lacerda and more
Potential Business Impact:
Helps robots plan better under time limits by estimating the value of promising moves they have not yet tried.
Monte Carlo Tree Search is a cornerstone algorithm for online planning, and its root-parallel variant is widely used when wall-clock time is limited but the best possible performance is desired. In environments with continuous action spaces, how best to aggregate statistics from different threads is an important yet underexplored question. In this work, we introduce a method that uses Gaussian Process Regression to obtain value estimates for promising actions that were not trialed in the environment. We perform a systematic evaluation across six domains, demonstrating that our approach outperforms existing aggregation strategies at the cost of only a modest increase in inference time.
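To make the aggregation idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a one-dimensional action in [-1, 1], a squared-exponential kernel with fixed hyperparameters, and a dense candidate grid; the function names (`rbf_kernel`, `gp_posterior_mean`, `aggregate_root_statistics`) and all numeric settings are hypothetical. Each thread contributes the root actions it tried and their value estimates, a Gaussian Process is fit to the pooled data, and the action with the highest posterior mean is returned, even if no thread ever tried it.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two arrays of 1-D actions."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior_mean(x_train, y_train, x_query, length_scale=0.2, noise=1e-2):
    """Posterior mean of a GP with constant prior mean and Gaussian noise."""
    prior_mean = y_train.mean()  # assumption: empirical mean as the prior mean
    gram = rbf_kernel(x_train, x_train, length_scale)
    gram += noise * np.eye(len(x_train))
    alpha = np.linalg.solve(gram, y_train - prior_mean)
    return rbf_kernel(x_query, x_train, length_scale) @ alpha + prior_mean


def aggregate_root_statistics(thread_stats, n_candidates=201, low=-1.0, high=1.0):
    """Pool (actions, values) from all threads and pick the best candidate.

    thread_stats: list of (actions, values) array pairs, one per thread.
    Returns the candidate action with the highest posterior mean and that mean.
    """
    actions = np.concatenate([a for a, _ in thread_stats])
    values = np.concatenate([v for _, v in thread_stats])
    candidates = np.linspace(low, high, n_candidates)
    mean = gp_posterior_mean(actions, values, candidates)
    best = int(np.argmax(mean))
    return float(candidates[best]), float(mean[best])


def toy_value(a):
    """Hypothetical root value function with its optimum at a = 0.25."""
    return 1.0 - (a - 0.25) ** 2


# Three simulated threads, each having tried a different set of root actions.
thread_actions = [
    np.linspace(-1.0, 0.8, 7),
    np.linspace(-0.9, 0.9, 7),
    np.linspace(-0.8, 1.0, 7),
]
thread_stats = [(a, toy_value(a)) for a in thread_actions]
best_action, best_value = aggregate_root_statistics(thread_stats)
print(best_action, best_value)
```

In this toy setup no thread samples the optimum at 0.25, so a strategy that only chooses among tried actions cannot return it, whereas the posterior mean can peak between samples. A real planner would also need to handle multi-dimensional actions, noisy returns, and hyperparameter selection, none of which are covered here.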