Imputation Uncertainty in Interpretable Machine Learning Methods
By: Pegah Golchian, Marvin N. Wright
Missing values occur frequently in real-world data and affect the interpretations obtained with interpretable machine learning (IML) methods. Recent work has focused on bias, showing that model explanations may differ between imputation methods, but it ignores the additional imputation uncertainty and its influence on variance and confidence intervals. We therefore compare the effects of different imputation methods on the confidence interval coverage probabilities of three IML methods: permutation feature importance, partial dependence plots, and Shapley values. We show that single imputation underestimates the variance and that, in most cases, only multiple imputation achieves coverage close to the nominal level.
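To illustrate the multiple-imputation idea the abstract describes, the sketch below generates m completed datasets, computes permutation feature importance on each, and pools the estimates with Rubin's rules to form a confidence interval that reflects imputation uncertainty. This is a minimal Python sketch, assuming scikit-learn's IterativeImputer and permutation_importance; the model, the number of imputations m, and the within-imputation variance approximation are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch: multiple imputation + Rubin's rules for a 95% confidence
# interval on permutation feature importance (PFI).
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance


def pfi_with_multiple_imputation(X_missing, y, m=5, n_repeats=10):
    """Pool PFI over m imputed datasets and return a 95% CI per feature."""
    estimates, variances = [], []
    for seed in range(m):
        # Each imputation uses a different seed -> m completed datasets.
        imputer = IterativeImputer(sample_posterior=True, random_state=seed)
        X_imp = imputer.fit_transform(X_missing)
        model = RandomForestRegressor(random_state=seed).fit(X_imp, y)
        res = permutation_importance(model, X_imp, y,
                                     n_repeats=n_repeats, random_state=seed)
        estimates.append(res.importances_mean)
        # Rough within-imputation variance of the mean importance (assumption).
        variances.append(res.importances_std ** 2 / n_repeats)

    estimates, variances = np.array(estimates), np.array(variances)
    # Rubin's rules: pooled estimate, within- and between-imputation variance.
    q_bar = estimates.mean(axis=0)                 # pooled point estimate
    u_bar = variances.mean(axis=0)                 # within-imputation variance
    b = estimates.var(axis=0, ddof=1)              # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                    # total variance
    z = 1.96                                       # normal approximation for 95% CI
    half_width = z * np.sqrt(t)
    return q_bar, q_bar - half_width, q_bar + half_width
```

Conceptually, single imputation corresponds to dropping the between-imputation term b from the total variance, which is exactly the source of the variance underestimation and the poor coverage the abstract reports.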