OffSim: Offline Simulator for Model-based Offline Inverse Reinforcement Learning
By: Woo-Jin Ahn, Sang-Ryul Baek, Yong-Jun Lee, and more
Potential Business Impact:
Teaches robots new tasks without real-world practice.
Reinforcement learning algorithms typically rely on an interactive simulator (i.e., environment) with a predefined reward function for policy training. Developing such simulators and manually defining reward functions, however, is often time-consuming and labor-intensive. To address this, we propose the Offline Simulator (OffSim), a novel model-based offline inverse reinforcement learning (IRL) framework that emulates environmental dynamics and reward structure directly from expert-generated state-action trajectories. OffSim jointly optimizes a high-entropy transition model and an IRL-based reward function to enhance exploration and improve the generalizability of the learned reward. Leveraging these learned components, OffSim can subsequently train a policy offline without further interaction with the real environment. Additionally, we introduce OffSim+, an extension that incorporates a marginal reward in multi-dataset settings to further enhance exploration. Extensive MuJoCo experiments demonstrate that OffSim achieves substantial performance gains over existing offline IRL methods, confirming its efficacy and robustness.
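To make the two-stage recipe in the abstract concrete, here is a minimal, hypothetical PyTorch sketch of the first stage: jointly fitting a high-entropy transition model and an IRL-style reward from expert (state, action, next state) transitions. All names (TransitionModel, RewardModel, offsim_model_learning) and the exact loss terms are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the OffSim idea (assumed form, not the paper's code).
import torch
import torch.nn as nn

class TransitionModel(nn.Module):
    """Predicts a Gaussian over next states; its entropy can be rewarded
    to obtain the 'high-entropy transition model' described in the abstract."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * state_dim),  # mean and log-std of next state
        )

    def forward(self, s, a):
        mu, log_std = self.net(torch.cat([s, a], dim=-1)).chunk(2, dim=-1)
        return torch.distributions.Normal(mu, log_std.clamp(-5, 2).exp())

class RewardModel(nn.Module):
    """IRL-style reward: scores (state, action) pairs so expert data
    receives higher reward than policy-generated actions."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1)).squeeze(-1)

def offsim_model_learning(expert_batch, dynamics, reward, policy,
                          opt, entropy_coef=0.01):
    """One joint update: fit the dynamics to expert transitions with an
    entropy bonus, and push the reward up on expert pairs and down on
    actions proposed by the current policy (a simplified IRL objective)."""
    s, a, s_next = expert_batch
    dist = dynamics(s, a)
    nll = -dist.log_prob(s_next).sum(-1).mean()        # dynamics fit
    entropy_bonus = dist.entropy().sum(-1).mean()       # exploration term
    a_pi = policy(s).detach()                            # policy actions on expert states
    irl_loss = -(reward(s, a).mean() - reward(s, a_pi).mean())
    loss = nll - entropy_coef * entropy_bonus + irl_loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

In the second stage, as the abstract describes, the learned dynamics and reward would stand in for the real environment: the policy is improved purely by rolling out in the learned transition model and maximizing the learned reward, with no further environment interaction. OffSim+'s marginal reward for multi-dataset settings is not sketched here.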
Similar Papers
A Recipe for Efficient Sim-to-Real Transfer in Manipulation with Online Imitation-Pretrained World Models
Robotics
Teaches robots to learn from practice, not just examples.
Distributional Inverse Reinforcement Learning
Machine Learning (CS)
Learns how to do things by watching experts.
Online Optimization for Offline Safe Reinforcement Learning
Machine Learning (CS)
Teaches robots to do tasks safely and well.