Reinforcement Learning with Anticipation: A Hierarchical Approach for Long-Horizon Tasks
By: Yang Yu
Potential Business Impact:
Helps robots learn long, hard tasks by breaking them down.
Solving long-horizon goal-conditioned tasks remains a significant challenge in reinforcement learning (RL). Hierarchical reinforcement learning (HRL) addresses this by decomposing tasks into more manageable sub-tasks, but automatic discovery of the hierarchy and joint training of multi-level policies often suffer from instability and lack theoretical guarantees. In this paper, we introduce Reinforcement Learning with Anticipation (RLA), a principled and potentially scalable framework designed to address these limitations. The RLA agent learns two synergistic models: a low-level, goal-conditioned policy trained to reach specified subgoals, and a high-level anticipation model that acts as a planner, proposing intermediate subgoals on the optimal path to a final goal. The key feature of RLA is the training of the anticipation model, which is guided by a principle of value geometric consistency and regularized to prevent degenerate solutions. We present proofs that RLA approaches the globally optimal policy under various conditions, establishing a principled and convergent method for hierarchical planning and execution in long-horizon goal-conditioned tasks.
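The abstract does not spell out the training objective, but one plausible reading of "value geometric consistency" is the standard multiplicative identity for goal-conditioned values of the form V(s, g) ≈ γ^d(s, g): an optimal midpoint subgoal sg satisfies V(s, g) = V(s, sg) · V(sg, g). The PyTorch sketch below illustrates that reading; the names (`Anticipator`, `anticipation_loss`, `V`, `reg_weight`) and the bisection regularizer are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch, assuming: PyTorch; a learned goal-conditioned value
# V(s, g) in (0, 1], interpreted as gamma^(steps from s to g); and a
# bisection-style regularizer against degenerate subgoals. All names here
# are hypothetical, not taken from the paper.
import torch
import torch.nn as nn

class Anticipator(nn.Module):
    """High-level model f(s, g) -> intermediate subgoal between s and g."""
    def __init__(self, state_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, s: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s, g], dim=-1))

def anticipation_loss(anticipator, V, s, g, reg_weight: float = 0.1):
    """Value geometric consistency: for an optimal subgoal sg,
    V(s, g) = V(s, sg) * V(sg, g) when V(s, g) ~ gamma^d(s, g).

    sg = s or sg = g satisfies the identity trivially (since V(x, x) = 1),
    so a regularizer pushes sg toward the value midpoint of the path.
    """
    sg = anticipator(s, g)
    v_first, v_second = V(s, sg), V(sg, g)
    # Maximize the product of the two legs; at the optimum this product
    # equals V(s, g), the value of the direct path.
    consistency = -(v_first * v_second).mean()
    # Bisection regularizer: both legs should be equally "long", ruling
    # out the degenerate endpoints.
    bisection = (v_first - v_second).pow(2).mean()
    return consistency + reg_weight * bisection
```

Under this reading, the anticipation model can be applied recursively at execution time: each call halves the remaining horizon, and the low-level goal-conditioned policy only ever has to reach the nearest proposed subgoal.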
Similar Papers
Hierarchical Reinforcement Learning in Multi-Goal Spatial Navigation with Autonomous Mobile Robots
Artificial Intelligence
Robots learn to navigate complex places faster.
Hierarchical Reinforcement Learning with Targeted Causal Interventions
Machine Learning (CS)
Teaches robots to learn tasks faster.
Hierarchical Reinforcement Learning with Uncertainty-Guided Diffusional Subgoals
Machine Learning (CS)
Teaches robots to learn complex tasks faster.