Measuring Variable Importance via Accumulated Local Effects
By: Jingyu Zhu, Daniel W. Apley
A shortcoming of black-box supervised learning models is their lack of interpretability or transparency. To facilitate interpretation, post-hoc global variable importance measures (VIMs) are widely used to assign to each predictor or input variable a numerical score that represents the extent to which that predictor impacts the fitted model's response predictions across the training data. It is well known that the most common existing VIMs, namely marginal Shapley and marginal permutation-based methods, can produce unreliable results if the predictors are highly correlated, because they require extrapolation of the response at predictor values that fall far outside the training data. Conditional versions of Shapley and permutation VIMs avoid or reduce the extrapolation but can substantially deflate the importance of correlated predictors. For the related goal of visualizing the effects of each predictor when strong predictor correlation is present, accumulated local effects (ALE) plots were recently introduced and have been widely adopted. This paper presents a new VIM approach based on ALE concepts that avoids both the extrapolation and the VIM deflation problems when predictors are correlated. We contrast, both theoretically and numerically, ALE VIMs with Shapley and permutation VIMs. Our results indicate that ALE VIMs produce variable importance rankings similar to those of Shapley and permutation VIMs when predictor correlations are mild, and more reliable rankings when correlations are strong. An additional advantage is that ALE VIMs are far less computationally expensive.
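The paper's exact VIM definition is not reproduced in this abstract, but the underlying ALE idea can be sketched. Because ALE accumulates prediction differences taken only over narrow bins of a predictor, using the observed values of the other (possibly correlated) predictors, the model is never queried far outside the training data. Below is a minimal, hypothetical Python sketch, assuming the importance of predictor j is summarized as the standard deviation of its centered first-order ALE main effect over the training points; the function name ale_vim, the quantile binning, and n_bins=20 are illustrative assumptions, not the authors' implementation.

import numpy as np

def ale_vim(predict, X, j, n_bins=20):
    # Hypothetical sketch: first-order ALE curve for feature j and a
    # simple importance score (std. dev. of the centered ALE curve
    # evaluated at the training points).
    x = X[:, j]
    # Bin edges at empirical quantiles so each bin holds roughly equal data.
    edges = np.unique(np.quantile(x, np.linspace(0, 1, n_bins + 1)))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1,
                  0, len(edges) - 2)          # bin index of each point

    # Local effects: prediction change as x_j crosses its own bin,
    # with all other features held at their observed values, so no
    # extrapolation outside the data cloud is required.
    X_lo, X_hi = X.copy(), X.copy()
    X_lo[:, j] = edges[idx]
    X_hi[:, j] = edges[idx + 1]
    local = predict(X_hi) - predict(X_lo)

    # Average the local effects within each bin, then accumulate.
    bin_means = np.array([local[idx == k].mean() if np.any(idx == k) else 0.0
                          for k in range(len(edges) - 1)])
    ale = np.concatenate([[0.0], np.cumsum(bin_means)])  # curve at edges

    # Evaluate the curve at each training point, center, and summarize.
    ale_x = np.interp(x, edges, ale)
    ale_x -= ale_x.mean()
    return ale_x.std()

# Toy usage on a known linear model where x0 matters 3x more than x1:
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
f = lambda X: 3.0 * X[:, 0] + 1.0 * X[:, 1]
print([round(ale_vim(f, X, j), 2) for j in range(2)])  # roughly [3.0, 1.0]

Each ale_vim call needs only on the order of 2n model evaluations per predictor, which is consistent with the abstract's claim that ALE VIMs are far cheaper than Shapley-based scores.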