Membership Inference Attacks on Large-Scale Models: A Survey
By: Hengyu Wu, Yang Cao
Potential Business Impact:
Reveals whether your private data was used to train an AI model.
As large-scale models such as Large Language Models (LLMs) and Large Multimodal Models (LMMs) see increasing deployment, their privacy risks remain underexplored. Membership Inference Attacks (MIAs), which reveal whether a data point was used in training the target model, are an important technique for exposing or assessing privacy risks and have been shown to be effective across diverse machine learning algorithms. However, despite extensive studies on MIAs in classic models, there remains a lack of systematic surveys addressing their effectiveness and limitations in large-scale models. To address this gap, we provide the first comprehensive review of MIAs targeting LLMs and LMMs, analyzing attacks by model type, adversarial knowledge, and strategy. Unlike prior surveys, we further examine MIAs across multiple stages of the model pipeline, including pre-training, fine-tuning, alignment, and Retrieval-Augmented Generation (RAG). Finally, we identify open challenges and propose future research directions for strengthening privacy resilience in large-scale models.
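To make the core idea concrete, below is a minimal sketch of one widely studied MIA strategy that falls within the survey's scope: a loss-threshold attack against a causal language model, which scores a candidate text by its average per-token loss under the target model and flags low-loss texts as likely training members. The model name ("gpt2") and the threshold value are illustrative assumptions for this sketch, not parameters taken from the paper.

```python
# Minimal sketch of a loss-threshold membership inference attack.
# Assumptions: a HuggingFace causal LM ("gpt2", illustrative) stands in for the
# target model, and the decision threshold is a hypothetical value that would
# normally be calibrated on known member/non-member data.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def candidate_loss(model, tokenizer, text: str) -> float:
    """Average per-token negative log-likelihood of `text` under the model."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean cross-entropy loss.
        outputs = model(**inputs, labels=inputs["input_ids"])
    return outputs.loss.item()


def infer_membership(loss: float, threshold: float = 3.0) -> bool:
    """Lower loss (the model 'expects' the text) is treated as evidence of membership."""
    return loss < threshold


if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    candidate = "Example sentence whose training-set membership we want to test."
    loss = candidate_loss(model, tokenizer, candidate)
    print(f"loss={loss:.3f}, predicted member={infer_membership(loss)}")
```

In practice, stronger attacks surveyed by the paper replace the fixed threshold with calibrated scores (e.g., reference-model or perplexity-ratio comparisons), but the member/non-member decision structure is the same.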
Similar Papers
On the Effectiveness of Membership Inference in Targeted Data Extraction from Large Language Models
Machine Learning (CS)
Tests how well membership inference helps pull private training data out of language models.
Membership Inference Attacks Beyond Overfitting
Cryptography and Security
Shows training data can leak through membership inference even without overfitting.
Empirical Comparison of Membership Inference Attacks in Deep Transfer Learning
Machine Learning (CS)
Compares which membership inference attacks work best against transfer-learned models.