Online Robust Planning under Model Uncertainty: A Sample-Based Approach
By: Tamir Shazman, Idan Lev-Yehudi, Ron Benchetit and more
Potential Business Impact:
Helps robots make safe choices with bad information.
Online planning in Markov Decision Processes (MDPs) enables agents to make sequential decisions by simulating future trajectories from the current state, making it well-suited for large-scale or dynamic environments. Sample-based methods such as Sparse Sampling and Monte Carlo Tree Search (MCTS) are widely adopted for their ability to approximate optimal actions using a generative model. However, in practical settings, the generative model is often learned from limited data, introducing approximation errors that can degrade performance or lead to unsafe behaviors. To address these challenges, Robust MDPs (RMDPs) offer a principled framework for planning under model uncertainty, yet existing approaches are typically computationally intensive and not suited for real-time use. In this work, we introduce Robust Sparse Sampling (RSS), the first online planning algorithm for RMDPs with finite-sample theoretical performance guarantees. Unlike Sparse Sampling, which estimates the nominal value function, RSS computes a robust value function by leveraging the efficiency and theoretical properties of Sample Average Approximation (SAA), enabling tractable robust policy computation in online settings. RSS is applicable to infinite or continuous state spaces, and its sample and computational complexities are independent of the state space size. We provide theoretical performance guarantees and empirically show that RSS outperforms standard Sparse Sampling in environments with uncertain dynamics.
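To make the idea of a robust backup inside a Sparse Sampling tree more concrete, here is a minimal Python sketch. It is not the paper's RSS algorithm: the abstract does not specify the uncertainty set or the SAA formulation, so this sketch assumes a total-variation ball of radius rho around the empirical distribution of the sampled successors, and every name in it (worst_case_mean, robust_q, robust_value, plan, toy_model) is hypothetical. Setting rho = 0 recovers the nominal Sparse Sampling estimate.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical generative-model signature: (state, action, rng) -> (next_state, reward).
Model = Callable[[int, int, random.Random], Tuple[int, float]]


def worst_case_mean(values: List[float], rho: float) -> float:
    """Smallest mean over distributions on the sampled values that lie within
    total-variation distance rho of the uniform empirical distribution.
    (Assumed uncertainty set, chosen only for illustration.)"""
    n = len(values)
    vals = sorted(values)
    weights = [1.0 / n] * n
    budget = min(max(rho, 0.0), 1.0)
    i = n - 1
    # An adversary shifts probability mass from the best outcomes to the worst one.
    while budget > 1e-12 and i > 0:
        moved = min(weights[i], budget)
        weights[i] -= moved
        weights[0] += moved
        budget -= moved
        i -= 1
    return sum(w * v for w, v in zip(weights, vals))


def robust_q(model: Model, state: int, action: int, depth: int,
             actions: Sequence[int], width: int, gamma: float,
             rho: float, rng: random.Random) -> float:
    """Robust Q estimate from `width` sampled successors of (state, action)."""
    targets = []
    for _ in range(width):
        next_state, reward = model(state, action, rng)
        future = 0.0
        if depth > 1:
            future = robust_value(model, next_state, depth - 1,
                                  actions, width, gamma, rho, rng)
        targets.append(reward + gamma * future)
    return worst_case_mean(targets, rho)


def robust_value(model: Model, state: int, depth: int,
                 actions: Sequence[int], width: int, gamma: float,
                 rho: float, rng: random.Random) -> float:
    """Robust value estimate: best action under the worst-case backup."""
    return max(robust_q(model, state, a, depth, actions, width, gamma, rho, rng)
               for a in actions)


def plan(model: Model, state: int, depth: int, actions: Sequence[int],
         width: int, gamma: float, rho: float, rng: random.Random) -> int:
    """Return the action with the highest robust Q estimate at the root."""
    return max(actions, key=lambda a: robust_q(model, state, a, depth, actions,
                                               width, gamma, rho, rng))


def toy_model(state: int, action: int, rng: random.Random) -> Tuple[int, float]:
    """Toy single-state problem: action 0 is safe, action 1 is usually better
    but occasionally costly."""
    if action == 0:
        return state, 0.5
    return state, (1.0 if rng.random() < 0.9 else -1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    for rho in (0.0, 0.3):
        action = plan(toy_model, state=0, depth=2, actions=[0, 1],
                      width=20, gamma=0.95, rho=rho, rng=rng)
        print(f"rho={rho}: chosen action {action}")
```

The recursion makes (width × number of actions)^depth calls to the generative model, so its cost depends on the planning depth, the sampling width, and the number of actions, but not on the number of states, which is the property the abstract highlights for both Sparse Sampling and RSS.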
Similar Papers
Sparse Offline Reinforcement Learning with Corruption Robustness
Machine Learning (Stat)
Helps computers learn from bad data.
Provably Efficient Sample Complexity for Robust CMDP
Machine Learning (CS)
Teaches robots to be safe and smart.