Towards Unveiling Predictive Uncertainty Vulnerabilities in the Context of the Right to Be Forgotten
By: Wei Qian, Chenxu Zhao, Yangyi Li, and more
Potential Business Impact:
Tricks AI into being unsure about its answers.
Various uncertainty quantification methods have been proposed to provide certainty and probability estimates for deep learning models' label predictions. Meanwhile, with the growing demand for the right to be forgotten, machine unlearning has been extensively studied as a means to remove the influence of requested sensitive data from a pre-trained model without retraining the model from scratch. However, the vulnerability of these predictive uncertainty estimates to dedicated malicious unlearning attacks remains unexplored. To bridge this gap, we propose, for the first time, a new class of malicious unlearning attacks against predictive uncertainties, in which the adversary aims to induce targeted manipulations of specific predictive uncertainty results. We also design novel optimization frameworks for our attacks and conduct extensive experiments, including in black-box scenarios. Notably, the results show that our attacks are more effective at manipulating predictive uncertainties than traditional attacks that target label misclassification, and that existing defenses against conventional attacks are ineffective against ours.
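To make the threat model concrete, below is a minimal, hypothetical sketch of the setting the abstract describes: predictive uncertainty is measured as the entropy of a model's class probabilities, and the adversary searches for a set of unlearning requests whose removal inflates that uncertainty on chosen target inputs. This is not the paper's actual optimization framework; the helper names (predict_proba, unlearn, score_candidate_forget_set, greedy_attack) and the greedy search are illustrative assumptions.

```python
# Hypothetical sketch of an uncertainty-manipulation attack via unlearning requests.
# The unlearning routine and model interface are assumed, not taken from the paper.
import numpy as np


def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of per-sample class probabilities,
    a common predictive-uncertainty score (higher = less certain)."""
    eps = 1e-12
    return -(probs * np.log(probs + eps)).sum(axis=-1)


def score_candidate_forget_set(model, forget_idx, train_X, train_y, target_X, unlearn):
    """Attacker's (assumed) objective: after unlearning the candidate forget set,
    how uncertain is the model on the attacker's target inputs?
    `unlearn` is any approximate-unlearning routine exposed by the pipeline."""
    unlearned = unlearn(model, train_X[forget_idx], train_y[forget_idx])
    probs = unlearned.predict_proba(target_X)   # shape: (n_targets, n_classes)
    return predictive_entropy(probs).mean()     # maximize to inflate uncertainty


def greedy_attack(model, train_X, train_y, target_X, unlearn, budget=10):
    """Greedy baseline: repeatedly add the training point whose removal most
    increases target uncertainty, up to an unlearning-request budget."""
    chosen, remaining = [], list(range(len(train_X)))
    for _ in range(budget):
        scores = [
            score_candidate_forget_set(model, chosen + [i],
                                       train_X, train_y, target_X, unlearn)
            for i in remaining
        ]
        best = remaining[int(np.argmax(scores))]
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Minimizing the same entropy score instead would model the opposite manipulation, making the model appear overconfident on the targets; the paper's actual attacks replace this brute-force greedy search with dedicated optimization frameworks.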
Similar Papers
Inducing Uncertainty for Test-Time Privacy
Machine Learning (CS)
Makes AI forget data, even when it tries.
When Forgetting Triggers Backdoors: A Clean Unlearning Attack
Cryptography and Security
Makes AI forget things to trick it.
How Secure is Forgetting? Linking Machine Unlearning to Machine Learning Attacks
Cryptography and Security
Removes bad data from smart computer brains.