Why Goal-Conditioned Reinforcement Learning Works: Relation to Dual Control
By: Nathan P. Lawrence, Ali Mesbah
Potential Business Impact:
Explains why goal-reaching rewards help robots and controllers reliably hit target states, even under uncertainty.
Goal-conditioned reinforcement learning (RL) concerns the problem of training an agent to maximize the probability of reaching target goal states. This paper presents an analysis of the goal-conditioned setting based on optimal control. In particular, we derive an optimality gap between classical, often quadratic, objectives and the goal-conditioned reward, elucidating the success of goal-conditioned RL and why classical "dense" rewards can falter. We then consider the partially observed Markov decision setting and connect state estimation to our probabilistic reward, showing that the goal-conditioned reward is also well suited to dual control problems. The advantages of goal-conditioned policies are validated in nonlinear and uncertain environments using both RL and predictive control techniques.
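To make the distinction in the abstract concrete, here is a minimal Python sketch contrasting a classical dense quadratic reward with a goal-conditioned indicator reward. The function names, the weight matrix Q, and the tolerance eps are illustrative assumptions, not notation taken from the paper.

    # Minimal sketch (assumed names, not from the paper): a dense
    # quadratic tracking reward vs. a goal-conditioned indicator reward.
    import numpy as np

    def quadratic_reward(state, goal, Q):
        # Classical "dense" objective: negative quadratic tracking cost,
        # which rewards partial progress toward the goal.
        err = state - goal
        return -float(err @ Q @ err)

    def goal_conditioned_reward(state, goal, eps=0.05):
        # Goal-conditioned reward: 1 only when the state lies within a
        # tolerance eps of the goal, i.e. reward for reaching the goal
        # set rather than for intermediate shaping.
        return 1.0 if np.linalg.norm(state - goal) <= eps else 0.0

    # Example: a state near, but not at, the goal.
    s = np.array([0.9, 0.0])
    g = np.array([1.0, 0.0])
    Q = np.eye(2)
    print(quadratic_reward(s, g, Q))      # -0.01: dense signal everywhere
    print(goal_conditioned_reward(s, g))  # 0.0: no reward until the goal is reached

The contrast illustrates the trade-off the paper analyzes: the quadratic reward gives gradient-like feedback at every state, while the goal-conditioned reward directly scores the event of reaching the goal, which is the quantity a goal-conditioned agent maximizes the probability of.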
Similar Papers
Dense and Diverse Goal Coverage in Multi Goal Reinforcement Learning
Machine Learning (CS)
Teaches robots to explore many goals, not just one.
Autonomous Learning From Success and Failure: Goal-Conditioned Supervised Learning with Negative Feedback
Machine Learning (CS)
Helps robots learn from mistakes, not just wins.
R2L: Reliable Reinforcement Learning: Guaranteed Return & Reliable Policies in Reinforcement Learning
Machine Learning (CS)
Makes smart programs more dependable and safer.