Multi-source Multimodal Progressive Domain Adaption for Audio-Visual Deception Detection

Published: August 18, 2025 | arXiv ID: 2508.12842v1

By: Ronghao Lin, Sijie Mai, Ying Zeng, and more

Potential Business Impact:

Helps computers spot lies in videos and audio.

This paper presents the winning approach for the 1st MultiModal Deception Detection (MMDD) Challenge at the 1st Workshop on Subtle Visual Computing (SVC). To address the domain shift between source and target domains, we propose a Multi-source Multimodal Progressive Domain Adaptation (MMPDA) framework that transfers audio-visual knowledge from diverse source domains to the target domain. By gradually aligning the source and target domains at both the feature and decision levels, our method bridges domain shifts across diverse multimodal datasets. Extensive experiments demonstrate the effectiveness of our approach, which secured a Top-2 place. Our approach reaches 60.43% accuracy and a 56.99% F1-score in stage 2 of the competition, surpassing the 1st place team by 5.59% on F1-score and the 3rd place team by 6.75% on accuracy. Our code is available at https://github.com/RH-Lin/MMPDA.
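
The abstract does not specify the loss functions, so the following is only a minimal sketch of what aligning domains "at both feature and decision levels" could look like, not the authors' implementation. It assumes a mean-feature discrepancy (a linear-kernel MMD) for the feature level, a KL divergence between averaged class predictions for the decision level, and a linear ramp to make the alignment gradual. All function names and the ramp schedule are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    dim = len(rows[0])
    return [sum(row[j] for row in rows) / n for j in range(dim)]


def feature_alignment_loss(source_feats, target_feats):
    """Squared distance between mean source and mean target features.

    Assumption: a linear-kernel MMD stands in for whatever feature-level
    alignment the paper actually uses.
    """
    mu_s = mean_vector(source_feats)
    mu_t = mean_vector(target_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def decision_alignment_loss(source_probs, target_probs, eps=1e-8):
    """KL divergence between the average source and target predictions.

    Assumption: matching averaged class probabilities is one simple way
    to align domains at the decision level.
    """
    p = mean_vector(source_probs)
    q = mean_vector(target_probs)
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def progressive_weight(step, total_steps):
    """Hypothetical linear ramp from 0 to 1 over training."""
    return min(1.0, step / total_steps)


def alignment_loss(source_feats, target_feats, source_probs, target_probs,
                   step, total_steps):
    """Weighted sum of both alignment terms, scaled by the ramp."""
    w = progressive_weight(step, total_steps)
    feat = feature_alignment_loss(source_feats, target_feats)
    dec = decision_alignment_loss(source_probs, target_probs)
    return w * (feat + dec)
```

In a multi-source setting, a term like this would be computed once per source domain against the target and added to the usual classification loss; how MMPDA actually combines the sources is described in the paper and repository, not here.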

Country of Origin
🇨🇳 China

Repos / Data Links
https://github.com/RH-Lin/MMPDA

Page Count
7 pages

Category
Computer Science:
CV and Pattern Recognition