Safe Planning and Policy Optimization via World Model Learning
By: Artem Latyshev, Gregory Gorbov, Aleksandr I. Panov
Potential Business Impact:
Keeps robots safe while they learn new tasks.
Reinforcement Learning (RL) applications in real-world scenarios must prioritize safety and reliability, which impose strict constraints on agent behavior. Model-based RL leverages predictive world models for action planning and policy optimization, but inherent model inaccuracies can lead to catastrophic failures in safety-critical settings. We propose a novel model-based RL framework that jointly optimizes task performance and safety. To address world model errors, our method incorporates an adaptive mechanism that dynamically switches between model-based planning and direct policy execution. We resolve the objective mismatch problem of traditional model-based approaches using an implicit world model. Furthermore, our framework employs dynamic safety thresholds that adapt to the agent's evolving capabilities, consistently selecting actions that surpass safe policy suggestions in both performance and safety. Experiments demonstrate significant improvements over non-adaptive methods, showing that our approach optimizes safety and performance simultaneously rather than merely meeting minimum safety requirements. The proposed framework achieves robust performance on diverse safety-critical continuous control tasks, outperforming existing methods.
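To make the selection rule in the abstract concrete, below is a minimal, hypothetical sketch of how an agent might switch between a planner-proposed action and the safe policy's own suggestion, executing the planner's action only when it is predicted to be both higher-value and at least as safe, with a safety threshold that tracks the agent's evolving capability. All names (AdaptiveSafeSelector, value_fn, cost_fn, cost_ema) are illustrative assumptions, not the paper's actual estimators, planner, or thresholds.

```python
# Hypothetical sketch of adaptive switching between model-based planning and
# direct policy execution with a dynamic safety threshold. Illustrative only;
# the paper's implementation may differ substantially.

from dataclasses import dataclass
from typing import Callable
import numpy as np

Action = np.ndarray
State = np.ndarray


@dataclass
class AdaptiveSafeSelector:
    """Chooses between a planner's action and the safe policy's action."""

    value_fn: Callable[[State, Action], float]   # predicted task return
    cost_fn: Callable[[State, Action], float]    # predicted safety cost (lower = safer)
    cost_ema: float = 0.0                         # running estimate of the safe policy's cost
    ema_rate: float = 0.05                        # adaptation speed of the threshold

    def select(self, state: State, planned: Action, safe: Action) -> Action:
        """Execute the planner's action only if it beats the safe policy's
        suggestion on both predicted value and predicted safety."""
        v_plan, v_safe = self.value_fn(state, planned), self.value_fn(state, safe)
        c_plan, c_safe = self.cost_fn(state, planned), self.cost_fn(state, safe)

        # Dynamic threshold: track the safety cost the agent can currently achieve.
        self.cost_ema += self.ema_rate * (c_safe - self.cost_ema)

        better_value = v_plan >= v_safe
        at_least_as_safe = c_plan <= min(c_safe, self.cost_ema)
        return planned if (better_value and at_least_as_safe) else safe


if __name__ == "__main__":
    # Toy stand-ins for learned critics, just to show the control flow.
    rng = np.random.default_rng(0)
    selector = AdaptiveSafeSelector(
        value_fn=lambda s, a: float(-np.sum((a - 0.5) ** 2)),  # prefers actions near 0.5
        cost_fn=lambda s, a: float(np.sum(np.abs(a))),          # penalizes large actions
    )
    state = np.zeros(3)
    planned_action = rng.normal(0.5, 0.1, size=2)
    safe_action = np.zeros(2)
    print("executed:", selector.select(state, planned_action, safe_action))
```

The key design point mirrored from the abstract is that the planner's action must dominate the safe policy's suggestion on both criteria; otherwise the agent falls back to direct policy execution, which limits the damage a flawed world model can cause.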
Similar Papers
World Models for Anomaly Detection during Model-Based Reinforcement Learning Inference
Robotics
Keeps robots safe by stopping them when they're unsure.
Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning
Machine Learning (CS)
Teaches robots to learn from past mistakes.
Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics
Robotics
Robots learn to do new jobs by practicing in their minds.