The Surprising Difficulty of Search in Model-Based Reinforcement Learning
By: Wei-Di Chang, Mikael Henaff, Brandon Amos, and more
Potential Business Impact:
Helps AI agents that plan ahead learn tasks faster and better.
This paper investigates search in model-based reinforcement learning (RL). Conventional wisdom holds that long-term predictions and compounding errors are the primary obstacles for model-based RL. We challenge this view, showing that search is not a plug-and-play replacement for a learned policy. Surprisingly, we find that search can harm performance even when the model is highly accurate. Instead, we show that mitigating distribution shift matters more than improving model or value function accuracy. Building on this insight, we identify key techniques for enabling effective search, achieving state-of-the-art performance across multiple popular benchmark domains.
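The abstract contrasts executing a learned policy directly with performing search over a learned model. As a minimal sketch of what such search can look like (not the paper's actual method), the snippet below rolls out sampled action sequences through a learned dynamics model, scores them with a learned reward and value function, and returns the best first action. All components here (`dynamics_model`, `reward_model`, `value_fn`, `policy`) are hypothetical stand-ins for trained networks; centring the candidate actions on the policy's proposal is one illustrative way to keep rollouts near the training distribution, echoing the paper's point about distribution shift.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM = 4, 2

# Hypothetical learned components (placeholders for trained networks).
def dynamics_model(state, action):
    # Learned dynamics: s' = f(s, a).
    return state + 0.1 * np.tanh(action).sum() * np.ones_like(state)

def reward_model(state, action):
    # Learned one-step reward.
    return -float(np.square(state).sum())

def value_fn(state):
    # Learned value function, used to bootstrap beyond the horizon.
    return -float(np.square(state).sum())

def policy(state):
    # Learned policy: the "plug-and-play" baseline that search would replace.
    return np.zeros(ACTION_DIM)

def search_action(state, horizon=5, n_candidates=64, noise_scale=0.1):
    """Sampling-based search over the learned model.

    Candidates are small perturbations of the policy's actions, so model
    rollouts stay close to the data the model and value function were
    trained on (one way to mitigate distribution shift).
    """
    best_score, best_first_action = -np.inf, policy(state)
    for _ in range(n_candidates):
        s, total, first_action = state.copy(), 0.0, None
        for t in range(horizon):
            a = policy(s) + noise_scale * rng.standard_normal(ACTION_DIM)
            if t == 0:
                first_action = a
            total += reward_model(s, a)
            s = dynamics_model(s, a)
        total += value_fn(s)  # terminal bootstrap
        if total > best_score:
            best_score, best_first_action = total, first_action
    return best_first_action

state = np.zeros(STATE_DIM)
print(search_action(state))
```

Per the abstract's finding, widening `noise_scale` (searching far from the policy's distribution) can hurt even with an accurate model, since the value function is queried on states it was never trained on.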
Similar Papers
Reinforcement Learning for Long-Horizon Multi-Turn Search Agents
Computation and Language
AI learns better by trying and failing.
A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications
Artificial Intelligence
Teaches computers to find better answers online.
Is Exploration or Optimization the Problem for Deep Reinforcement Learning?
Machine Learning (CS)
Asks what really stops learning computers from improving.