Distributional Inverse Reinforcement Learning
By: Feiyang Wu, Ye Zhao, Anqi Wu
Potential Business Impact:
Learns how to do things by watching experts.
We propose a distributional framework for offline Inverse Reinforcement Learning (IRL) that jointly models uncertainty over reward functions and the full distribution of returns. Unlike conventional IRL approaches, which recover a deterministic reward estimate or match only expected returns, our method captures richer structure in expert behavior. In particular, it learns the reward distribution by minimizing first-order stochastic dominance (FSD) violations and integrates distortion risk measures (DRMs) into policy learning, enabling the recovery of both reward distributions and distribution-aware policies. This formulation is well suited to behavior analysis and risk-aware imitation learning. Empirical results on synthetic benchmarks, real-world neurobehavioral data, and MuJoCo control tasks demonstrate that our method recovers expressive reward representations and achieves state-of-the-art imitation performance.
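As a rough illustration of the two ingredients named in the abstract, the sketch below (Python, with hypothetical names such as fsd_violation and distorted_value; not the authors' implementation) shows one common way to quantify an FSD violation between empirical return distributions and to apply a distortion risk measure such as CVaR to a sample of returns.

```python
import numpy as np

def empirical_cdf(samples, grid):
    """Empirical CDF of `samples` evaluated at each point in `grid`."""
    return (samples[None, :] <= grid[:, None]).mean(axis=1)

def fsd_violation(expert_returns, policy_returns, n_grid=256):
    """Integrated first-order stochastic dominance (FSD) violation:
    how much the policy's return CDF exceeds the expert's, i.e. where
    the policy places extra probability mass on low returns."""
    lo = min(expert_returns.min(), policy_returns.min())
    hi = max(expert_returns.max(), policy_returns.max())
    grid = np.linspace(lo, hi, n_grid)
    gap = np.maximum(empirical_cdf(policy_returns, grid)
                     - empirical_cdf(expert_returns, grid), 0.0)
    return float(gap.sum() * (hi - lo) / (n_grid - 1))

def distorted_value(returns, alpha=0.25):
    """Distortion risk measure (DRM) of a return sample. The distortion
    g(u) = min(u / alpha, 1) recovers CVaR at level alpha: a risk-averse
    average over the worst alpha-fraction of returns."""
    x = np.sort(returns)
    u = np.arange(len(x) + 1) / len(x)    # CDF levels 0, 1/n, ..., 1
    g = np.minimum(u / alpha, 1.0)        # distorted CDF levels
    return float(np.sum(np.diff(g) * x))  # quantile-weighted average

# Toy usage with arbitrary return samples (illustrative only).
rng = np.random.default_rng(0)
expert = rng.normal(1.0, 0.5, size=1000)
policy = rng.normal(0.8, 0.7, size=1000)
print(fsd_violation(expert, policy), distorted_value(policy))
```

In such a setup, the FSD violation acts as a training penalty that pushes the learned return distribution toward (weakly) dominating, or matching, the expert's, while the distortion weighting turns the recovered return distribution into a risk-aware objective for policy learning.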
Similar Papers
Symmetry-Guided Multi-Agent Inverse Reinforcement Learning
Robotics
Robots learn better with fewer examples.
Inverse Reinforcement Learning Using Just Classification and a Few Regressions
Machine Learning (CS)
Teaches robots to learn by watching.