The Tail Tells All: Estimating Model-Level Membership Inference Vulnerability Without Reference Models
By: Euodia Dodd, Nataša Krčo, Igor Shilov, and more
Potential Business Impact:
Finds AI privacy risks without needing many AI copies.
Membership inference attacks (MIAs) have emerged as the standard tool for evaluating the privacy risks of AI models. However, state-of-the-art attacks require training numerous, often computationally expensive, reference models, limiting their practicality. We present a novel approach for estimating model-level vulnerability to membership inference attacks, the true positive rate (TPR) at low false positive rate (FPR), without requiring reference models. Empirical analysis shows loss distributions to be asymmetric and heavy-tailed, and suggests that most points at risk from MIAs have moved from the tail (high-loss region) to the head (low-loss region) of the distribution after training. We leverage this insight to propose a method that estimates model-level vulnerability from the training and testing loss distributions alone, using the absence of outliers in the high-loss region as a predictor of risk. We evaluate our method, the true negative rate (TNR) of a simple loss attack, across a wide range of architectures and datasets and show that it accurately estimates model-level vulnerability to the state-of-the-art attack LiRA. We also show that our method outperforms both low-cost (few reference models) attacks such as RMIA and other measures of distributional difference. Finally, we explore the use of non-linear functions for estimating risk and show the approach to be promising for evaluating the risk of large language models.
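As a rough illustration of the idea behind the abstract, the sketch below computes the TPR, FPR, and TNR of a simple loss-threshold attack from training (member) and testing (non-member) losses. The synthetic gamma-distributed losses, the 95th-percentile threshold choice, and the helper name loss_attack_rates are illustrative assumptions, not the paper's exact estimator or operating point.

```python
import numpy as np

def loss_attack_rates(member_losses, nonmember_losses, threshold):
    """Simple loss-threshold attack: predict 'member' when loss < threshold.

    Returns (tpr, fpr, tnr). A generic sketch of a loss attack, not
    necessarily the exact vulnerability estimator used in the paper.
    """
    member_losses = np.asarray(member_losses)
    nonmember_losses = np.asarray(nonmember_losses)
    tpr = np.mean(member_losses < threshold)     # members correctly flagged
    fpr = np.mean(nonmember_losses < threshold)  # non-members wrongly flagged
    tnr = 1.0 - fpr                              # non-members left in the high-loss tail
    return tpr, fpr, tnr

# Hypothetical usage: synthetic, heavy-tailed losses stand in for a real model's
# training/testing losses; the threshold is an assumed operating point.
rng = np.random.default_rng(0)
train_losses = rng.gamma(shape=1.5, scale=0.2, size=10_000)  # concentrated near the head
test_losses = rng.gamma(shape=2.0, scale=0.5, size=10_000)   # heavier high-loss tail
threshold = np.quantile(train_losses, 0.95)
tpr, fpr, tnr = loss_attack_rates(train_losses, test_losses, threshold)
print(f"TPR={tpr:.3f}  FPR={fpr:.3f}  TNR={tnr:.3f}")
```

Under the paper's insight, a non-member loss distribution with few points remaining in the high-loss tail (low TNR separation from the member distribution) would signal higher model-level vulnerability; the exact mapping to TPR at low FPR is the paper's contribution and is not reproduced here.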
Similar Papers
Empirical Comparison of Membership Inference Attacks in Deep Transfer Learning
Machine Learning (CS)
Finds best ways to check if AI learned private info.
Membership Inference Attacks Beyond Overfitting
Cryptography and Security
Protects private data used to train smart programs.