Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods
By: Panagiota Kiourti, Anu Singh, Preeti Duraipandian, and others
Potential Business Impact:
Makes AI explanations more trustworthy and accurate.
This paper studies the robustness of feature attribution methods for deep neural networks. We challenge the prevailing notion of attributional robustness, which largely ignores differences in the model's outputs, and introduce a new way of evaluating the robustness of attribution methods. Specifically, we propose a new definition of similar inputs, a new robustness metric, and a novel method based on generative adversarial networks for generating such inputs. In addition, we present a comprehensive evaluation against existing metrics and state-of-the-art attribution methods. Our findings highlight the need for a more objective metric, one that reveals the weaknesses of an attribution method rather than those of the underlying neural network, thereby providing a more accurate assessment of attributional robustness.
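The core distinction the abstract draws can be illustrated with a short sketch: attribution stability should only be judged on input pairs for which the model's outputs themselves barely change, since otherwise a shift in the attribution map may reflect a genuine change in the model's behavior rather than a weakness of the attribution method. The paper's actual definitions, metric, and GAN-based input generator are not reproduced here; the `attribute` callable, the output-gap threshold `eps`, and the cosine-similarity score below are illustrative assumptions, not the authors' method.

```python
import torch
import torch.nn.functional as F

def attribution_robustness(model, attribute, x, x_sim, eps=1e-3):
    """Hypothetical robustness score for one pair of 'similar' inputs.

    `attribute(model, x)` is assumed to return an attribution (saliency)
    map for input x. The pair is scored only if the model's outputs are
    close, reflecting the paper's critique of metrics that ignore
    output differences.
    """
    with torch.no_grad():
        out_gap = torch.norm(model(x) - model(x_sim))
    if out_gap > eps:
        # Outputs differ noticeably: an attribution change here may expose
        # the network's sensitivity, not the attribution method's, so the
        # pair is excluded from the robustness measurement.
        return None
    a, a_sim = attribute(model, x), attribute(model, x_sim)
    # Cosine similarity between flattened attribution maps;
    # 1.0 means the attributions are identical up to scale.
    return F.cosine_similarity(a.flatten(), a_sim.flatten(), dim=0).item()
```

Under this (assumed) formulation, a low score on a retained pair indicts the attribution method alone, which is the kind of objectivity the paper argues existing robustness evaluations lack.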
Similar Papers
Attribution Explanations for Deep Neural Networks: A Theoretical Perspective
Machine Learning (CS)
Makes AI decisions easier to understand.
Towards Trustworthy Wi-Fi Sensing: Systematic Evaluation of Deep Learning Model Robustness to Adversarial Attacks
Machine Learning (CS)
Makes wireless sensing safer from hacking.
Nonparametric Data Attribution for Diffusion Models
Machine Learning (CS)
Shows which training pictures made new art.