Explainable RL Policies by Distilling to Locally-Specialized Linear Policies with Voronoi State Partitioning
By: Senne Deproost, Dennis Steckelmacher, Ann Nowé
Potential Business Impact:
Makes smart computer brains easy to understand.
Deep Reinforcement Learning is one of the state-of-the-art methods for producing near-optimal system controllers. However, deep RL algorithms train a deep neural network that lacks transparency, which poses challenges when the controller has to meet regulations or foster trust. To alleviate this, one can transfer the learned behaviour into a model that is human-readable by design using knowledge distillation. Often this is done with a single model, which mimics the original model on average but can struggle in more dynamic situations. A key challenge is that this simpler model must strike the right balance between flexibility and complexity, or between bias and accuracy. We propose a new model-agnostic method to divide the state space into regions in which a simplified, human-understandable model can operate. In this paper, we use Voronoi partitioning to find regions where linear models can achieve similar performance to the original controller. We evaluate our approach on a gridworld environment and a classic control task. We observe that our proposed distillation to locally-specialized linear models produces policies that are explainable, and we show that the distilled policies match or even slightly outperform the black-box policy they are distilled from.
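The abstract outlines the core mechanism: place sites in the state space, let nearest-site assignment induce Voronoi cells, and fit one linear policy per cell on state-action data collected from the black-box teacher. Below is a minimal sketch of that idea, assuming k-means centroids as Voronoi sites and ordinary least-squares local models; the class name, hyperparameters, and stand-in teacher policy are illustrative assumptions, not the authors' reference implementation.

    # Sketch: distil a black-box policy into locally-specialized linear
    # policies over a Voronoi partition of the state space (illustrative).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression

    class VoronoiLinearPolicy:
        def __init__(self, n_cells: int, seed: int = 0):
            self.n_cells = n_cells
            # Cluster centres act as Voronoi sites; nearest-site assignment
            # induces the cells of the partition.
            self.kmeans = KMeans(n_clusters=n_cells, n_init=10, random_state=seed)
            self.models = []  # one linear model per Voronoi cell

        def fit(self, states: np.ndarray, actions: np.ndarray):
            cells = self.kmeans.fit_predict(states)
            self.models = []
            for c in range(self.n_cells):
                mask = cells == c
                model = LinearRegression()
                model.fit(states[mask], actions[mask])  # local linear fit
                self.models.append(model)
            return self

        def predict(self, state: np.ndarray) -> np.ndarray:
            # Route the state to its nearest site, apply that cell's model.
            cell = self.kmeans.predict(state.reshape(1, -1))[0]
            return self.models[cell].predict(state.reshape(1, -1))[0]

    # Usage: fit the student on states and teacher actions from rollouts.
    teacher = lambda s: np.tanh(s @ np.array([[0.5], [-1.0]]))  # stand-in teacher
    states = np.random.default_rng(0).uniform(-1, 1, size=(2000, 2))
    actions = teacher(states)
    student = VoronoiLinearPolicy(n_cells=8).fit(states, actions)
    print(student.predict(states[0]), actions[0])

Each cell's model is a plain linear map from state to action, so a reader can inspect the coefficients per region, which is what makes the distilled policy explainable by design.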
Similar Papers
Refined Policy Distillation: From VLA Generalists to RL Experts
Robotics
Teaches robots to do tasks better than humans.
To Distill or Decide? Understanding the Algorithmic Trade-off in Partially Observable Reinforcement Learning
Machine Learning (CS)
Teaches robots to learn better from hidden information.
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
Robotics
Robots learn to improve by themselves.