Unsupervised Skill Discovery through Skill Regions Differentiation
By: Ting Xiao, Jiakun Zheng, Rushuai Yang, and more
Potential Business Impact:
Lets robots discover useful skills on their own, so they learn new tasks faster.
Unsupervised Reinforcement Learning (RL) aims to discover diverse behaviors that can accelerate the learning of downstream tasks. Previous methods typically focus on entropy-based exploration or empowerment-driven skill learning. However, entropy-based exploration struggles in large-scale state spaces (e.g., images), and empowerment-based methods that rely on Mutual Information (MI) estimation explore the state space poorly. To address these challenges, we propose a novel skill discovery objective that maximizes the deviation of one skill's state density from the regions explored by other skills, encouraging inter-skill state diversity much as the original MI objective does. For state-density estimation, we construct a novel conditional autoencoder with soft modularization across the different skill policies in high-dimensional spaces. Meanwhile, to incentivize intra-skill exploration, we formulate an intrinsic reward from the learned autoencoder that resembles count-based exploration in a compact latent space. Through extensive experiments on challenging state- and image-based tasks, we find that our method learns meaningful skills and achieves superior performance on various downstream tasks.
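The abstract pairs a skill-conditioned autoencoder, used to estimate each skill's state density, with an intrinsic reward that acts like a count-based bonus in the autoencoder's latent space. Below is a minimal PyTorch sketch of that pairing, assuming a k-nearest-neighbor novelty bonus as the count-based-style reward; the names SkillConditionedAE and intrinsic_reward are illustrative, and the paper's soft modularization and exact reward form are not reproduced here.

import torch
import torch.nn as nn

class SkillConditionedAE(nn.Module):
    """Autoencoder over states, conditioned on a one-hot skill vector,
    yielding a compact latent code per (state, skill) pair."""
    def __init__(self, state_dim, skill_dim, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + skill_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + skill_dim, 256), nn.ReLU(),
            nn.Linear(256, state_dim),
        )

    def forward(self, state, skill):
        latent = self.encoder(torch.cat([state, skill], dim=-1))
        recon = self.decoder(torch.cat([latent, skill], dim=-1))
        return latent, recon

def intrinsic_reward(ae, state, skill, visited_latents, k=10):
    """Count-based-style bonus (an assumed stand-in, not the paper's
    exact form): states whose latent codes sit far from previously
    visited latents earn a larger reward, approximating low visit counts."""
    with torch.no_grad():
        latent, _ = ae(state, skill)                   # (B, latent_dim)
        dists = torch.cdist(latent, visited_latents)   # (B, N) pairwise distances
        knn = dists.topk(k, largest=False).values      # (B, k) nearest neighbors
        return knn.mean(dim=-1)                        # (B,) novelty bonus

In this sketch, visited_latents would be the encoded states stored in a replay buffer; a sparsely visited region then has distant nearest neighbors in latent space, so its bonus is high, which is the count-based intuition the abstract invokes.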
Similar Papers
Periodic Skill Discovery
Machine Learning (CS)
Teaches robots to learn new, repeating movements.
Offline Reinforcement Learning with Discrete Diffusion Skills
Machine Learning (CS)
Teaches robots complex tasks with fewer steps.