A Dynamical Systems Framework for Reinforcement Learning Safety and Robustness Verification
By: Ahmed Nasir, Abdelhafid Zenati
Potential Business Impact:
Shows how to check whether a smart program will behave safely before it is put to use.
The application of reinforcement learning to safety-critical systems is limited by the lack of formal methods for verifying the robustness and safety of learned policies. This paper introduces a novel framework that addresses this gap by analyzing the combination of an RL agent and its environment as a discrete-time autonomous dynamical system. By leveraging tools from dynamical systems theory, specifically the Finite-Time Lyapunov Exponent (FTLE), we identify and visualize Lagrangian Coherent Structures (LCS) that act as the hidden "skeleton" governing the system's behavior. We demonstrate that repelling LCS function as safety barriers around unsafe regions, while attracting LCS reveal the system's convergence properties and potential failure modes, such as unintended "trap" states. To move beyond qualitative visualization, we introduce a suite of quantitative metrics: Mean Boundary Repulsion (MBR), Aggregated Spurious Attractor Strength (ASAS), and Temporally-Aware Spurious Attractor Strength (TASAS). These metrics formally measure a policy's safety margin and robustness. We further provide a method for deriving local stability guarantees and extend the analysis to handle model uncertainty. Through experiments in both discrete and continuous control environments, we show that this framework provides a comprehensive and interpretable assessment of policy behavior, successfully identifying critical flaws in policies that appear successful based on reward alone.
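To make the FTLE idea concrete, the minimal Python sketch below (not taken from the paper; the policy, env_step, horizon, and 2-D state grid are illustrative assumptions) estimates an FTLE field for the closed-loop system x_{k+1} = env_step(x_k, policy(x_k)) by finite-differencing the flow map. Ridges of the resulting field approximate the repelling LCS that the abstract interprets as safety barriers.

import numpy as np

def closed_loop_step(x, policy, env_step):
    # One step of the autonomous system x_{k+1} = F(x_k) = env_step(x_k, policy(x_k)).
    return env_step(x, policy(x))

def flow_map(x0, policy, env_step, horizon):
    # Iterate the closed-loop map for `horizon` steps from initial state x0.
    x = np.asarray(x0, dtype=float)
    for _ in range(horizon):
        x = closed_loop_step(x, policy, env_step)
    return x

def ftle_field(grid_x, grid_y, policy, env_step, horizon, eps=1e-4):
    # Finite-difference FTLE over a 2-D grid of initial states:
    #   sigma(x0) = (1 / horizon) * log(sqrt(lambda_max(C))),
    # where C = J^T J and J is the flow-map Jacobian estimated by central differences.
    sigma = np.zeros((len(grid_y), len(grid_x)))
    for i, y0 in enumerate(grid_y):
        for j, x0 in enumerate(grid_x):
            J = np.zeros((2, 2))
            for d in range(2):  # perturb each state dimension in turn
                e = np.zeros(2)
                e[d] = eps
                xp = flow_map(np.array([x0, y0]) + e, policy, env_step, horizon)
                xm = flow_map(np.array([x0, y0]) - e, policy, env_step, horizon)
                J[:, d] = (xp - xm) / (2.0 * eps)
            lam_max = np.linalg.eigvalsh(J.T @ J)[-1]  # largest eigenvalue of Cauchy-Green tensor
            sigma[i, j] = np.log(np.sqrt(max(lam_max, 1e-300))) / horizon
    return sigma  # high-valued ridges approximate repelling LCS

Grid resolution and horizon trade off how sharply the ridges resolve against computation time; quantitative scores such as the paper's MBR, ASAS, and TASAS would presumably be computed on top of a field like this one.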
Similar Papers
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
Systems and Control
Keeps smart machines from making dangerous mistakes.
Breaking the Safety-Capability Tradeoff: Reinforcement Learning with Verifiable Rewards Maintains Safety Guardrails in LLMs
Machine Learning (CS)
Trains AI to be smart and safe together.