Trajectory First: A Curriculum for Discovering Diverse Policies
By: Cornelius V. Braun, Sayantan Auddy, Marc Toussaint
Potential Business Impact:
Teaches robots many ways to do jobs.
Being able to solve a task in diverse ways makes agents more robust to task variations and less prone to local optima. In this context, constrained diversity optimization has emerged as a powerful reinforcement learning (RL) framework for training a diverse set of agents in parallel. However, existing constrained-diversity RL methods often under-explore in complex tasks such as robotic manipulation, leading to a lack of policy diversity. To improve diversity optimization in RL, we therefore propose a curriculum that first explores at the trajectory level before learning step-based policies. Our empirical evaluation provides novel insights into the shortcomings of skill-based diversity optimization and demonstrates that our curriculum improves the diversity of the learned skills.
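To make the two-stage idea concrete, here is a minimal Python sketch of a trajectory-first curriculum: a first stage that searches directly in trajectory space for a set of rollouts that are mutually diverse subject to a return constraint, and a second stage that distills each rollout into a step-based policy. This is an illustration under stated assumptions, not the paper's implementation: the toy 2D task, the random-search optimizer, the least-squares behavior cloning, and all names (`rollout_return`, `stage1_trajectory_search`, `stage2_distill`) and parameters are hypothetical.

```python
import numpy as np

# Hypothetical sketch of a trajectory-first diversity curriculum
# (not the authors' code).
# Stage 1: optimize a set of trajectories for mutual diversity,
#          subject to a per-trajectory return (quality) constraint.
# Stage 2: distill each discovered trajectory into a step-based policy.

rng = np.random.default_rng(0)

def rollout_return(traj):
    """Placeholder task reward: prefer trajectories ending near a goal."""
    goal = np.array([1.0, 1.0])
    return -np.linalg.norm(traj[-1] - goal)

def diversity(trajs):
    """Mean pairwise distance between trajectory endpoints."""
    ends = np.stack([t[-1] for t in trajs])
    dists = np.linalg.norm(ends[:, None] - ends[None, :], axis=-1)
    return dists.sum() / (len(trajs) * (len(trajs) - 1))

def stage1_trajectory_search(n_skills=4, horizon=20, iters=500,
                             min_return=-2.0):
    """Random-search ascent on diversity under a (loose) return constraint."""
    trajs = [np.cumsum(rng.normal(0, 0.1, size=(horizon, 2)), axis=0)
             for _ in range(n_skills)]
    for _ in range(iters):
        i = rng.integers(n_skills)
        cand = trajs[i] + rng.normal(0, 0.05, size=trajs[i].shape)
        new = trajs[:i] + [cand] + trajs[i + 1:]
        # Accept only if the quality constraint holds and diversity improves.
        if rollout_return(cand) >= min_return and diversity(new) > diversity(trajs):
            trajs = new
    return trajs

def stage2_distill(traj):
    """Behavior-clone a linear step-based policy a_t = W^T [s_t, 1]."""
    states, actions = traj[:-1], np.diff(traj, axis=0)
    X = np.hstack([states, np.ones((len(states), 1))])
    W, *_ = np.linalg.lstsq(X, actions, rcond=None)
    return W  # policy parameters

skills = [stage2_distill(t) for t in stage1_trajectory_search()]
print(f"Distilled {len(skills)} diverse step-based skills.")
```

The point of the sketch is the ordering: diversity is resolved in trajectory space first, so the step-based policies in the second stage only have to imitate the resulting rollouts rather than explore for diverse behavior from scratch.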
Similar Papers
Causally Aligned Curriculum Learning
Machine Learning (CS)
Teaches robots to learn faster with tricky problems.
Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning
Machine Learning (CS)
Teaches robots to learn new tasks by themselves.
Parental Guidance: Efficient Lifelong Learning through Evolutionary Distillation
Robotics
Robots learn many skills by copying and improving.