Membership Inference on LLMs in the Wild
By: Jiatong Yi, Yanyang Li
Potential Business Impact:
Finds whether private info was in an AI's training data.
Membership Inference Attacks (MIAs) act as a crucial auditing tool for the opaque training data of Large Language Models (LLMs). However, existing techniques predominantly rely on inaccessible model internals (e.g., logits) or suffer from poor generalization across domains in strict black-box settings where only generated text is available. In this work, we propose SimMIA, a robust MIA framework tailored for this text-only regime by leveraging an advanced sampling strategy and scoring mechanism. Furthermore, we present WikiMIA-25, a new benchmark curated to evaluate MIA performance on modern proprietary LLMs. Experiments demonstrate that SimMIA achieves state-of-the-art results in the black-box setting, rivaling baselines that exploit internal model information.
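The abstract does not detail SimMIA's sampling strategy or scoring mechanism, but the general text-only MIA recipe it builds on can be sketched: repeatedly sample continuations from the black-box model given a prefix of the candidate text, then score how closely those samples reproduce the candidate's suffix. Everything below is a hypothetical illustration, not SimMIA itself; `toy_model`, `infer_membership`, the similarity metric, and the threshold are all assumptions for demonstration.

```python
import difflib

def sample_continuations(generate, prompt, k=4):
    """Query the black-box model k times for continuations of `prompt`.

    `generate` stands in for any text-only API: it takes a prompt string
    and returns generated text. No logits or internals are needed.
    """
    return [generate(prompt) for _ in range(k)]

def similarity_score(candidate_suffix, samples):
    """Score membership by the best string similarity between any sampled
    continuation and the candidate suffix (higher = more likely memorized)."""
    return max(
        difflib.SequenceMatcher(None, candidate_suffix, s).ratio()
        for s in samples
    )

def infer_membership(generate, prefix, suffix, threshold=0.7, k=4):
    """Flag the text (prefix + suffix) as a training-set member if the
    model's own continuations of `prefix` closely match `suffix`.
    The threshold is an arbitrary illustrative value."""
    samples = sample_continuations(generate, prefix, k)
    return similarity_score(suffix, samples) >= threshold

# Hypothetical stand-in for a proprietary LLM that has memorized one string.
def toy_model(prompt):
    memorized = {"The quick brown": " fox jumps over the lazy dog"}
    return memorized.get(prompt, " some generic continuation")

# A memorized sample is flagged; an unseen one is not.
seen = infer_membership(toy_model, "The quick brown", " fox jumps over the lazy dog")
unseen = infer_membership(toy_model, "Once upon a", " cat sat on the mat")
```

Real attacks replace the single similarity call with more robust scoring (e.g., aggregating over many samples and normalizing against reference text), which is where frameworks like SimMIA aim to improve generalization across domains.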
Similar Papers
Membership Inference Attacks on Large-Scale Models: A Survey
Machine Learning (CS)
Finds if your private info trained AI.
On the Effectiveness of Membership Inference in Targeted Data Extraction from Large Language Models
Machine Learning (CS)
Stops AI from accidentally sharing private secrets.
Imitative Membership Inference Attack
Cryptography and Security
Finds if private data was used to train AI.