Multimodal Misinformation Detection Using Early Fusion of Linguistic, Visual, and Social Features
By: Gautam Kishore Shahi
Potential Business Impact:
Finds fake news using text, pictures, and user info.
Amid the tidal wave of misinformation that floods social media during elections and crises, extensive research has been conducted on misinformation detection, focusing primarily on text-based or image-based approaches. Only a few studies have explored multimodal feature combinations, such as integrating text and images to build a classification model for detecting misinformation. This study investigates the effectiveness of different multimodal feature combinations, incorporating textual, visual, and social features into a classification model through an early fusion approach. The analysis covers 1,529 tweets, collected from Twitter (now X) during the COVID-19 pandemic and election periods, each containing both text and an image. A data enrichment process extracted additional social features, as well as visual features obtained through techniques such as object detection and optical character recognition (OCR). The results show that combining unsupervised and supervised machine learning models improves classification performance by 15% over unimodal models and by 5% over bimodal models. The study also analyzes the propagation patterns of misinformation based on the characteristics of misinformation tweets and the users who disseminate them.
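To make the early fusion setup concrete, the sketch below concatenates text, visual, and social feature blocks into a single matrix before training one classifier, and folds in an unsupervised signal as an extra feature block. This is a minimal, hypothetical scikit-learn example on toy data: the feature choices (TF-IDF over tweet text, OCR text, and object labels; numeric social features; k-means cluster ids) and all field names are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Toy stand-ins for the enriched dataset; in the study, OCR text and object
# labels would come from an OCR engine and an object detector run offline.
tweets = [
    "miracle cure stops the virus overnight",
    "health ministry publishes vaccination schedule",
    "ballots found dumped in river, officials silent",
    "election commission confirms certified results",
    "secret memo proves lockdown hoax",
    "hospital reports steady decline in ICU cases",
]
ocr_text = ["BREAKING CURE", "", "DUMPED BALLOTS", "", "LEAKED MEMO", ""]
object_labels = ["person bottle", "document", "box river", "flag podium",
                 "paper stamp", "building chart"]
social = np.array([  # e.g. followers, retweets, account age in years
    [120, 340, 0.3],
    [54000, 12, 6.1],
    [90, 510, 0.2],
    [120000, 45, 8.4],
    [310, 280, 0.5],
    [8000, 9, 4.0],
])
labels = np.array([1, 0, 1, 0, 1, 0])  # 1 = misinformation

# Per-modality feature blocks.
X_text = TfidfVectorizer().fit_transform(tweets)
X_ocr = TfidfVectorizer().fit_transform(ocr_text)
X_obj = TfidfVectorizer().fit_transform(object_labels)
X_social = csr_matrix(social)

# One plausible reading of "combining unsupervised and supervised models":
# fuse in cluster assignments from the text as one more feature block.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_text)

# Early fusion: concatenate every block into a single feature matrix
# before the supervised classifier ever sees the data.
X = hstack([X_text, X_ocr, X_obj, X_social,
            csr_matrix(clusters.reshape(-1, 1).astype(float))]).tocsr()

X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=2, stratify=labels, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```

Concatenating features before any model is trained is what distinguishes early fusion from late fusion, where separate per-modality classifiers are trained first and only their predictions are combined.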
Similar Papers
A New Hybrid Intelligent Approach for Multimodal Detection of Suspected Disinformation on TikTok
CV and Pattern Recognition
Finds fake videos on TikTok using AI.
Exploring Modality Disruption in Multimodal Fake News Detection
Multimedia
Finds fake news by ignoring disrupted parts.
Large Language Models and Provenance Metadata for Determining the Relevance of Images and Videos in News Stories
Computation and Language
Finds fake news by checking text and pictures.