Lost in Modality: Evaluating the Effectiveness of Text-Based Membership Inference Attacks on Large Multimodal Models
By: Ziyi Tong, Feifei Sun, Le Minh Nguyen
Potential Business Impact:
Finds if private images were used to train AI.
Multimodal Large Language Models (MLLMs) are emerging as foundational tools across an expanding range of applications, so understanding training-data leakage in these systems is increasingly critical. Log-probability-based membership inference attacks (MIAs) have become a widely adopted approach for assessing data exposure in large language models (LLMs), yet their effectiveness in MLLMs remains unclear. We present the first comprehensive evaluation of extending these text-based MIA methods to multimodal settings. Our experiments under vision-and-text (V+T) and text-only (T-only) conditions across the DeepSeek-VL and InternVL model families show that in in-distribution settings, logit-based MIAs perform comparably across configurations, with a slight V+T advantage. Conversely, in out-of-distribution settings, visual inputs act as regularizers, effectively masking membership signals.
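As a rough illustration of the text-based attacks evaluated here, the sketch below computes a log-probability membership score with a generic Hugging Face causal LM. The model name, threshold, and example text are placeholders, not the paper's setup: the actual experiments use DeepSeek-VL and InternVL, and the V+T condition would additionally condition the score on image features, which is model-specific and omitted here.

```python
# Minimal sketch of a log-probability-based membership inference score.
# Assumes a Hugging Face causal LM; model name, threshold, and text are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper evaluates DeepSeek-VL and InternVL
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def membership_score(text: str) -> float:
    """Return the mean token log-probability of `text` under the model.

    Higher (less negative) scores suggest the sample is more likely to have
    been seen during training; a calibrated threshold turns the score into a
    member / non-member decision.
    """
    enc = tokenizer(text, return_tensors="pt")
    out = model(**enc, labels=enc["input_ids"])
    return -out.loss.item()  # negative mean cross-entropy = mean log-prob per token

candidate = "An example caption whose membership we want to test."
score = membership_score(candidate)
threshold = -3.5  # illustrative; in practice chosen on held-out non-member data
print(f"score={score:.3f} -> {'member' if score > threshold else 'non-member'}")
```

In the T-only condition of the paper, only the textual response is scored this way; in the V+T condition, the same likelihood is computed while the model also attends to the paired image.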
Similar Papers
On the Effectiveness of Membership Inference in Targeted Data Extraction from Large Language Models
Machine Learning (CS)
Stops AI from accidentally sharing private secrets.
Membership Inference Attacks on Large-Scale Models: A Survey
Machine Learning (CS)
Finds if your private info trained AI.
FiMMIA: scaling semantic perturbation-based membership inference across modalities
Machine Learning (CS)
Finds if private data was used to train AI.