Doctor Rashomon and the UNIVERSE of Madness: Variable Importance with Unobserved Confounding and the Rashomon Effect
By: Jon Donnelly, Srikar Katta, Emanuele Borgonovo and more
Potential Business Impact:
Find hidden important factors even with missing data.
Variable importance (VI) methods are often used for hypothesis generation, feature selection, and scientific validation. In the standard VI pipeline, an analyst estimates VI for a single predictive model using only the observed features. However, the importance of a feature depends heavily on which other variables are included in the model, and essential variables are often omitted from observational datasets. Moreover, the VI estimated for one model is often not the same as the VI estimated for another equally good model, a phenomenon known as the Rashomon Effect. We address these gaps by introducing UNobservables and Inference for Variable importancE using Rashomon SEts (UNIVERSE). Our approach adapts Rashomon sets, the sets of near-optimal models for a dataset, to produce bounds on the true VI even with missing features. We theoretically guarantee the robustness of our approach, show strong performance on semi-synthetic simulations, and demonstrate its utility in a credit risk task.
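To make the Rashomon Effect concrete, here is a minimal Python sketch. It is not the UNIVERSE method and does not handle unobserved features. It only shows how the importance of one feature can vary across models with nearly the same loss. Everything in it is an illustrative assumption: the synthetic data, the hand-built grid of linear models, the threshold `epsilon`, the use of permutation importance as the VI measure, and the helper names `mse`, `permutation_importance`, and `rashomon_vi_range`.

```python
import numpy as np

def mse(w, X, y):
    """Mean squared error of the linear model X @ w."""
    return float(np.mean((X @ w - y) ** 2))

def permutation_importance(w, X, y, j, rng, n_repeats=20):
    """Average increase in MSE when column j is randomly permuted."""
    base = mse(w, X, y)
    increases = []
    for _ in range(n_repeats):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        increases.append(mse(w, Xp, y) - base)
    return float(np.mean(increases))

def rashomon_vi_range(candidates, X, y, j, epsilon, rng):
    """Min and max VI of feature j over candidates within epsilon of the best loss."""
    losses = np.array([mse(w, X, y) for w in candidates])
    keep = losses <= losses.min() + epsilon  # empirical Rashomon set (hypothetical threshold)
    vis = [permutation_importance(w, X, y, j, rng)
           for w, kept in zip(candidates, keep) if kept]
    return min(vis), max(vis), int(keep.sum())

# Synthetic data (assumption): x2 is a noisy copy of x1, so many weightings fit equally well.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.5 * rng.normal(size=n)

# Candidate models (assumption): linear models whose two weights sum to 2.
candidates = [np.array([a, 2.0 - a]) for a in np.linspace(0.0, 2.0, 21)]

lo, hi, size = rashomon_vi_range(candidates, X, y, j=0, epsilon=0.05, rng=rng)
print(f"{size} near-optimal models; VI of feature 0 ranges over [{lo:.3f}, {hi:.3f}]")
```

Because the two features are nearly interchangeable, models that put almost all weight on either one achieve similar loss, yet they assign feature 0 very different importance. Reporting a range over the near-optimal set, rather than a single number from one model, is the intuition that the abstract's bounds build on.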
Similar Papers
Inference on Local Variable Importance Measures for Heterogeneous Treatment Effects
Methodology
Helps doctors choose best treatments for each person.
Intervention Efficiency and Perturbation Validation Framework: Capacity-Aware and Robust Clinical Model Selection under the Rashomon Effect
Machine Learning (CS)
Helps doctors pick the best computer helper.
"A 6 or a 9?": Ensemble Learning Through the Multiplicity of Performant Models and Explanations
Machine Learning (CS)
Finds best computer answers from many good ones.