
Unraveling Hidden Representations: A Multi-Modal Layer Analysis for Better Synthetic Content Forensics

Published: August 1, 2025 | arXiv ID: 2508.00784v1

By: Tom Or, Omri Azencot

Potential Business Impact:

Spots fake pictures and sounds fast.

Generative models achieve remarkable results across multiple data domains, including images and text. Unfortunately, malicious users exploit synthetic media to spread misinformation and disseminate deepfakes. Consequently, the need for robust and stable fake detectors is pressing, especially as new generative models appear every day. While most existing work trains classifiers that discriminate between real and fake information, such tools typically generalize only within the same family of generators and data modalities, yielding poor results on other generative classes and data domains. Towards a universal classifier, we propose the use of large pre-trained multi-modal models for the detection of generative content. Effectively, we show that the latent code of these models naturally captures information discriminating real from fake. Building on this observation, we demonstrate that linear classifiers trained on these features can achieve state-of-the-art results across various modalities, while remaining computationally efficient, fast to train, and effective even in few-shot settings. Our work primarily focuses on fake detection in audio and images, achieving performance that surpasses or matches that of strong baseline methods.
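The core idea in the abstract, a linear classifier trained on frozen features from a pre-trained model, can be illustrated with a minimal sketch. This is not the authors' implementation: the features here are synthetic Gaussian vectors standing in for embeddings that would, in practice, come from an intermediate layer of a multi-modal model, and every name (`make_synthetic_features`, `train_linear_probe`, the dimension, the class shift, the learning rate) is a hypothetical choice made for illustration only.

```python
import math
import random


def make_synthetic_features(n_per_class, dim, shift, rng):
    """ASSUMPTION: stand-in for frozen embeddings from a pre-trained model.

    Label 0 ("real") is drawn from N(0, 1) per dimension and
    label 1 ("fake") from N(shift, 1) per dimension.
    """
    feats, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            feats.append([rng.gauss(shift * label, 1.0) for _ in range(dim)])
            labels.append(label)
    return feats, labels


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(feats, labels, lr=0.1, epochs=200):
    """Fit logistic regression (a linear probe) with batch gradient descent."""
    dim = len(feats[0])
    n = len(feats)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(feats, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 ("fake") if the linear score is positive, else 0 ("real")."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def accuracy(w, b, feats, labels):
    correct = sum(predict(w, b, x) == y for x, y in zip(feats, labels))
    return correct / len(labels)


rng = random.Random(0)
# Few-shot setting: only 16 labelled examples per class for training.
train_x, train_y = make_synthetic_features(16, 8, 1.5, rng)
test_x, test_y = make_synthetic_features(200, 8, 1.5, rng)

w, b = train_linear_probe(train_x, train_y)
test_acc = accuracy(w, b, test_x, test_y)
print(f"few-shot linear probe test accuracy: {test_acc:.3f}")
```

Because only the weight vector and bias are learned while the feature extractor stays fixed, training of this kind is cheap, which is consistent with the abstract's claim that such classifiers are fast to train. How well a probe separates real from fake on actual data depends entirely on the quality of the underlying features, which the synthetic setup above does not model.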

Toward Generalized Detection of Synthetic Media: Limitations, Challenges, and the Path to Multimodal Solutions

CV and Pattern Recognition

Finds fake pictures and videos made by computers.

14 Nov 2025 0

89%

Could AI Trace and Explain the Origins of AI-Generated Images and Text?

Computation and Language

Finds fake AI pictures and writing.

5 Apr 2025 1

89%

Can GPT tell us why these images are synthesized? Empowering Multimodal Large Language Models for Forensics

CV and Pattern Recognition

Finds fake images and shows how they were made.

16 Apr 2025 1


Country of Origin

🇮🇱 Israel

Page Count

24 pages
