A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
By: Dhruv Singh Kushwaha, Zoleikha Abdollahi Biron
Potential Business Impact:
Keeps smart robots from making dangerous mistakes.
Reinforcement learning (RL) has proven to be particularly effective in solving complex decision-making problems for a wide range of applications. From a control theory perspective, RL can be considered as an adaptive optimal control scheme. Lyapunov and barrier functions are the most commonly used certificates to guarantee system stability for a proposed/derived controller and constraint satisfaction guarantees, respectively, in control theoretic approaches. However, compared to theoretical guarantees available in control theoretic methods, RL lacks closed-loop stability of a computed policy and constraint satisfaction guarantees. Safe reinforcement learning refers to a class of constrained problems where the constraint violations lead to partial or complete system failure. The goal of this review is to provide an overview of safe RL techniques using Lyapunov and barrier functions to guarantee this notion of safety discussed (stability of the system in terms of a computed policy and constraint satisfaction during training and deployment). The different approaches employed are discussed in detail along with their shortcomings and benefits to provide critique and possible future research directions. Key motivation for this review is to discuss current theoretical approaches for safety and stability guarantees in RL similar to control theoretic approaches using Lyapunov and barrier functions. The review provides proven potential and promising scope of providing safety guarantees for complex dynamical systems with operational constraints using model-based and model-free RL.
Similar Papers
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
Systems and Control
Keeps smart machines from making dangerous mistakes.
Stability Enhancement in Reinforcement Learning via Adaptive Control Lyapunov Function
Machine Learning (CS)
Makes robots learn safely without breaking things.
Barrier Function Overrides For Non-Convex Fixed Wing Flight Control and Self-Driving Cars
Robotics
Keeps robots safe while learning new tasks.