Multi-source Multimodal Progressive Domain Adaption for Audio-Visual Deception Detection
By: Ronghao Lin, Sijie Mai, Ying Zeng and more
Potential Business Impact:
Helps computers spot lies in videos and audio.
This paper presents the winning approach for the 1st MultiModal Deception Detection (MMDD) Challenge at the 1st Workshop on Subtle Visual Computing (SVC). To address the domain shift between source and target domains, we propose a Multi-source Multimodal Progressive Domain Adaptation (MMPDA) framework that transfers audio-visual knowledge from diverse source domains to the target domain. By gradually aligning the source and target domains at both the feature and decision levels, our method bridges domain shifts across diverse multimodal datasets. Extensive experiments demonstrate the effectiveness of our approach, which secured a Top-2 place. Our approach reaches 60.43% accuracy and a 56.99% F1-score in competition stage 2, surpassing the 1st-place team by 5.59% in F1-score and the 3rd-place teams by 6.75% in accuracy. Our code is available at https://github.com/RH-Lin/MMPDA.
Similar Papers
SVC 2025: the First Multimodal Deception Detection Challenge
CV and Pattern Recognition
Teaches computers to spot lies in voices and faces.
Vision-aware Multimodal Prompt Tuning for Uploadable Multi-source Few-shot Domain Adaptation
CV and Pattern Recognition
Lets computers learn from less data, faster.
Denoising and Alignment: Rethinking Domain Generalization for Multimodal Face Anti-Spoofing
CV and Pattern Recognition
Stops fake faces from tricking security cameras.