Model Inversion Attacks on Vision-Language Models: Do They Leak What They Learn?
By: Ngoc-Bao Nguyen, Sy-Tuyen Ho, Koh Jun Hao, and more
Potential Business Impact:
Steals private pictures from smart AI.
Model inversion (MI) attacks pose significant privacy risks by reconstructing private training data from trained neural networks. While prior work has focused on conventional unimodal DNNs, the vulnerability of vision-language models (VLMs) remains underexplored. In this paper, we conduct the first study of VLMs' vulnerability to leaking private visual training data. Tailored to VLMs' token-based generative nature, we propose a suite of novel token-based and sequence-based model inversion strategies: Token-based Model Inversion (TMI), Convergent Token-based Model Inversion (TMI-C), Sequence-based Model Inversion (SMI), and Sequence-based Model Inversion with Adaptive Token Weighting (SMI-AW). Through extensive experiments and a user study on three state-of-the-art VLMs and multiple datasets, we demonstrate, for the first time, that VLMs are susceptible to training data leakage. The experiments show that our proposed sequence-based methods, particularly SMI-AW combined with a logit-maximization loss based on vocabulary representation, achieve competitive reconstructions and outperform token-based methods in attack accuracy and visual similarity. Importantly, human evaluation of the reconstructed images yields an attack accuracy of 75.31%, underscoring the severity of model inversion threats in VLMs. Notably, we also demonstrate inversion attacks on publicly released VLMs. Our study reveals the privacy vulnerability of VLMs as they become increasingly popular across applications such as healthcare and finance.
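For readers unfamiliar with model inversion, the sketch below illustrates the general idea behind a sequence-based attack under a logit-maximization objective: an input image is optimized so that the VLM assigns high logits to the tokens of a private target caption. This is a minimal sketch under stated assumptions, not the paper's actual method; the interface vlm(image, token_ids) -> per-token logits, the starting point, and all hyperparameters are illustrative assumptions.

import torch

def sequence_inversion(vlm, target_ids, steps=500, lr=0.05):
    """Sketch of sequence-based inversion: optimize an image so the VLM
    assigns high logits to every token of a private target sequence.
    `vlm` is an assumed callable mapping (image, token_ids) to logits of
    shape (batch, seq_len, vocab_size); `target_ids` has shape (1, seq_len)."""
    # Start from random noise in image space and optimize it directly.
    x = torch.randn(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = vlm(x, target_ids)  # (1, seq_len, vocab_size)
        # Logit maximization: raise the logit of each target token,
        # summed across the whole sequence.
        tgt = logits.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)
        # SMI-AW would additionally reweight each token's term adaptively;
        # that weighting scheme is not given in the abstract, so uniform
        # weights are used here.
        loss = -tgt.sum()
        loss.backward()
        opt.step()
    return x.detach()

A token-based variant would instead optimize against one token position at a time; the sequence-level sum above is what lets gradients from the entire caption shape the reconstructed image at once.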
Similar Papers
Are Neuro-Inspired Multi-Modal Vision-Language Models Resilient to Membership Inference Privacy Leakage?
CV and Pattern Recognition
Makes AI models harder to steal private data from.
An Automated, Scalable Machine Learning Model Inversion Assessment Pipeline
Cryptography and Security
Protects secret data used to train smart computer programs.
Revisiting Model Inversion Evaluation: From Misleading Standards to Reliable Privacy Assessment
Machine Learning (CS)
Finds fake privacy leaks in AI.