Model Evaluation in the Dark: Robust Classifier Metrics with Missing Labels
By: Danial Dervovic, Michael Cashmore
Potential Business Impact:
Gives trustworthy model test scores even when some answers are missing.
Missing data in supervised learning is well-studied, but the specific issue of missing labels during model evaluation has been overlooked. Discarding samples with missing labels, a common workaround, can introduce bias, especially when data is Missing Not At Random (MNAR). We propose a multiple imputation technique for evaluating classifiers using metrics such as precision, recall, and ROC-AUC. This method offers not only point estimates but also a predictive distribution for these quantities when labels are missing. We empirically show that the predictive distribution's location and shape are generally correct, even in the MNAR regime. Moreover, we establish that this distribution is approximately Gaussian and provide finite-sample convergence bounds. Additionally, we present a robustness proof confirming the validity of the approximation under a realistic error model.
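To make the idea concrete, here is a minimal sketch of multiple imputation for classifier evaluation with missing labels. It is an illustrative reconstruction, not the authors' exact algorithm: the synthetic data, the logistic imputation model, the MNAR missingness mechanism, and the number of imputations are all assumptions made for the example.

```python
# Sketch: multiple imputation for evaluating a classifier when some test
# labels are missing. Assumptions (not from the paper): a logistic model on
# the features approximates P(y=1 | x) well enough to impute labels, and
# M = 200 imputations suffice to trace out the predictive distribution.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic evaluation set: features X, true labels y, and classifier scores.
n = 2000
X = rng.normal(size=(n, 5))
logits = X @ rng.normal(size=5)
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)
scores = 1 / (1 + np.exp(-(logits + rng.normal(scale=0.5, size=n))))

# MNAR-style missingness: positives are more likely to lose their label.
missing = rng.random(n) < np.where(y == 1, 0.4, 0.1)
y_obs = y.astype(float)
y_obs[missing] = np.nan

# Fit an imputation model on the samples whose labels were observed.
obs = ~missing
imp_model = LogisticRegression().fit(X[obs], y[obs])
p_missing = imp_model.predict_proba(X[missing])[:, 1]

# Multiple imputation: draw labels for the missing entries M times and
# recompute the metric each time, yielding a predictive distribution.
M = 200
aucs = np.empty(M)
y_imp = y_obs.copy()
for m in range(M):
    y_imp[missing] = (rng.random(missing.sum()) < p_missing).astype(float)
    aucs[m] = roc_auc_score(y_imp.astype(int), scores)

print(f"ROC-AUC point estimate: {aucs.mean():.3f}")
print(f"Predictive interval (2.5%, 97.5%): "
      f"({np.quantile(aucs, 0.025):.3f}, {np.quantile(aucs, 0.975):.3f})")
```

Pooling the M metric values gives both a point estimate (the mean) and an uncertainty interval, which is the predictive distribution the abstract refers to; the paper's contribution is showing this distribution stays approximately correct and Gaussian even under MNAR, which a naive imputation model like the one above does not guarantee.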
Similar Papers
Prediction Models That Learn to Avoid Missing Values
Machine Learning (CS)
Helps computers guess answers when data is missing.
Learning Accurate Models on Incomplete Data with Minimal Imputation
Machine Learning (CS)
Fixes messy data faster for smarter computers.
Comparison of Parametric versus Machine-learning Multiple Imputation in Clinical Trials with Missing Continuous Outcomes
Methodology
Helps doctors trust study results with missing info.