Model Predictive Adversarial Imitation Learning for Planning from Observation
By: Tyler Han, Yanda Bao, Bhaumik Mehta, and more
Potential Business Impact:
Teaches robots to plan and learn from watching.
Human demonstration data is often ambiguous and incomplete, motivating imitation learning approaches that also exhibit reliable planning behavior. A common paradigm for planning from demonstration is to learn a reward function via Inverse Reinforcement Learning (IRL) and then deploy that reward via Model Predictive Control (MPC). Towards unifying these methods, we derive a replacement of the policy in IRL with a planning-based agent. Through its connections to Adversarial Imitation Learning, this formulation enables end-to-end interactive learning of planners from observation-only demonstrations. Beyond benefits in interpretability, complexity, and safety, we study and observe significant improvements in sample efficiency, out-of-distribution generalization, and robustness. The study includes evaluations on simulated control benchmarks and in real-world navigation experiments using few-to-single observation-only demonstrations.
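To make the formulation concrete, the following is a minimal, self-contained Python sketch (not the authors' implementation) of an adversarial imitation loop in which a sampling-based MPC planner replaces the learned policy: a GAIL-style discriminator trained on states alone supplies the reward, and random-shooting MPC plans against it. The point-mass dynamics, the Discriminator class, and all hyperparameters are illustrative assumptions.

    import numpy as np

    # Illustrative toy setup (an assumption, not from the paper): point-mass
    # state s in R^2, action a in R^2, known dynamics s' = s + 0.1 * a.
    def dynamics(s, a):
        return s + 0.1 * a

    class Discriminator:
        """Tiny logistic discriminator D(s) over states only (observation-only):
        trained toward D ~ 0 on expert states and D ~ 1 on planner states."""
        def __init__(self, dim=2, lr=0.1):
            self.w, self.b, self.lr = np.zeros(dim), 0.0, lr

        def __call__(self, s):
            return 1.0 / (1.0 + np.exp(-(s @ self.w + self.b)))

        def update(self, expert, agent):
            # One SGD step of binary cross-entropy per batch (expert=0, agent=1).
            for X, y in ((expert, 0.0), (agent, 1.0)):
                g = self(X) - y                      # dL/dlogit
                self.w -= self.lr * X.T @ g / len(X)
                self.b -= self.lr * g.mean()

    def mpc_plan(reward_fn, s, horizon=5, n_samples=128):
        """Random-shooting MPC: sample action sequences, roll the model
        forward, score the visited states with the discriminator-derived
        reward, and return the first action of the best sequence."""
        seqs = np.random.uniform(-1, 1, size=(n_samples, horizon, 2))
        best_a, best_ret = seqs[0, 0], -np.inf
        for seq in seqs:
            x, ret = s, 0.0
            for a in seq:
                x = dynamics(x, a)
                ret += reward_fn(x)
            if ret > best_ret:
                best_a, best_ret = seq[0], ret
        return best_a

    # Observation-only "demonstrations": expert states clustered near a goal.
    expert_states = np.random.normal([1.0, 1.0], 0.05, size=(256, 2))
    D = Discriminator()
    for _ in range(50):
        s, agent_states = np.zeros(2), []
        for _ in range(30):
            # Plan against the current adversarial reward; low D(s) means
            # the state looks expert-like, so reward = -log D(s).
            s = dynamics(s, mpc_plan(lambda x: -np.log(D(x) + 1e-8), s))
            agent_states.append(s)
        D.update(expert_states, np.array(agent_states))  # adversarial update

Replanning at every step is what makes the agent a planner rather than a reactive policy; swapping the toy dynamics for a learned model and the logistic discriminator for a neural network recovers the general recipe the abstract describes.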
Similar Papers
Symmetry-Guided Multi-Agent Inverse Reinforcement Learning
Robotics
Robots learn better with fewer examples.
Multi-Agent Inverse Q-Learning from Demonstrations
Multiagent Systems
Teaches robots to work together better.