ForensicFlow: A Tri-Modal Adaptive Network for Robust Deepfake Detection
By: Mohammad Romani
Potential Business Impact:
Finds fake videos by looking at pictures, textures, and frequency patterns.
Deepfakes generated by advanced GANs and autoencoders severely threaten information integrity and societal stability. Single-stream CNNs fail to capture multi-scale forgery artifacts across spatial, texture, and frequency domains, limiting robustness and generalization. We introduce ForensicFlow, a tri-modal forensic framework that synergistically fuses RGB, texture, and frequency evidence for video deepfake detection. The RGB branch (ConvNeXt-tiny) extracts global visual inconsistencies; the texture branch (Swin Transformer-tiny) detects fine-grained blending artifacts; the frequency branch (CNN + SE) identifies periodic spectral noise. Attention-based temporal pooling dynamically prioritizes high-evidence frames, while adaptive attention fusion balances branch contributions. Trained on Celeb-DF (v2) with Focal Loss, ForensicFlow achieves an AUC of 0.9752, an F1-score of 0.9408, and an accuracy of 0.9208, outperforming single-stream baselines. Ablation studies validate branch synergy; Grad-CAM confirms forensic focus. This comprehensive feature fusion provides superior resilience against subtle forgeries.
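To make the architecture described above concrete, the following is a minimal PyTorch sketch of a tri-branch (RGB / texture / frequency) video classifier with attention-based temporal pooling and adaptive branch fusion. The tiny convolutional encoders, feature dimensions, class names, and fusion gate are illustrative assumptions standing in for the paper's ConvNeXt-tiny, Swin Transformer-tiny, and CNN + SE backbones; this is not the authors' implementation.

```python
# Sketch of a tri-branch deepfake detector: per-frame encoders, attention-based
# temporal pooling per branch, and an adaptive (softmax-gated) fusion of branches.
# All module names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel reweighting, as used in the frequency branch."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze -> (B, C)
        return x * w[:, :, None, None]           # excite


def tiny_encoder(in_ch: int, feat_dim: int) -> nn.Module:
    """Placeholder per-frame encoder standing in for a pretrained backbone."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        SEBlock(64),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim),
    )


class AttentionTemporalPool(nn.Module):
    """Scores each frame and returns an attention-weighted average over time."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, x):                        # x: (B, T, D)
        a = torch.softmax(self.score(x), dim=1)  # (B, T, 1) frame weights
        return (a * x).sum(dim=1)                # (B, D)


class ForensicFlowSketch(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.rgb = tiny_encoder(3, feat_dim)      # global visual inconsistencies
        self.texture = tiny_encoder(1, feat_dim)  # fine-grained blending artifacts
        self.freq = tiny_encoder(1, feat_dim)     # periodic spectral noise
        self.pools = nn.ModuleList([AttentionTemporalPool(feat_dim) for _ in range(3)])
        self.fusion_gate = nn.Linear(3 * feat_dim, 3)  # adaptive branch weights
        self.classifier = nn.Linear(feat_dim, 1)       # real vs. fake logit

    def _encode(self, frames, encoder):           # frames: (B, T, C, H, W)
        b, t = frames.shape[:2]
        feats = encoder(frames.flatten(0, 1))     # (B*T, D)
        return feats.view(b, t, -1)               # (B, T, D)

    def forward(self, rgb, texture, freq):
        branch_feats = [
            pool(self._encode(x, enc))
            for x, enc, pool in zip((rgb, texture, freq),
                                    (self.rgb, self.texture, self.freq), self.pools)
        ]                                          # three (B, D) vectors
        w = torch.softmax(self.fusion_gate(torch.cat(branch_feats, dim=1)), dim=1)
        fused = sum(w[:, i:i + 1] * f for i, f in enumerate(branch_feats))
        return self.classifier(fused).squeeze(1)  # (B,) logits


if __name__ == "__main__":
    model = ForensicFlowSketch()
    B, T = 2, 8                                   # batch of 2 clips, 8 frames each
    logits = model(torch.randn(B, T, 3, 64, 64),  # RGB frames
                   torch.randn(B, T, 1, 64, 64),  # texture maps (e.g., LBP)
                   torch.randn(B, T, 1, 64, 64))  # frequency maps (e.g., DCT/FFT)
    print(logits.shape)                           # torch.Size([2])
```

During training, a focal loss (for example, torchvision.ops.sigmoid_focal_loss applied to these logits) would down-weight easy examples, consistent with the abstract's use of Focal Loss on Celeb-DF (v2).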
Similar Papers
A Dual-Branch CNN for Robust Detection of AI-Generated Facial Forgeries
CV and Pattern Recognition
Finds fake faces in pictures better than people.
A Hybrid Deep Learning and Forensic Approach for Robust Deepfake Detection
CV and Pattern Recognition
Finds fake videos by combining clues.
Dual-Branch Convolutional Framework for Spatial and Frequency-Based Image Forgery Detection
Machine Learning (CS)
Finds fake pictures by looking at details.