WISE: Web Information Satire and Fakeness Evaluation
By: Gaurab Chhetri, Subasish Das, Tausif Islam Chowdhury
Distinguishing fake or untrue news from satire or humor poses a unique challenge due to their overlapping linguistic features and divergent intent. This study develops WISE (Web Information Satire and Fakeness Evaluation) framework which benchmarks eight lightweight transformer models alongside two baseline models on a balanced dataset of 20,000 samples from Fakeddit, annotated as either fake news or satire. Using stratified 5-fold cross-validation, we evaluate models across comprehensive metrics including accuracy, precision, recall, F1-score, ROC-AUC, PR-AUC, MCC, Brier score, and Expected Calibration Error. Our evaluation reveals that MiniLM, a lightweight model, achieves the highest accuracy (87.58%) among all models, while RoBERTa-base achieves the highest ROC-AUC (95.42%) and strong accuracy (87.36%). DistilBERT offers an excellent efficiency-accuracy trade-off with 86.28\% accuracy and 93.90\% ROC-AUC. Statistical tests confirm significant performance differences between models, with paired t-tests and McNemar tests providing rigorous comparisons. Our findings highlight that lightweight models can match or exceed baseline performance, offering actionable insights for deploying misinformation detection systems in real-world, resource-constrained settings.
Similar Papers
Generalization Gaps in Political Fake News Detection: An Empirical Study on the LIAR Dataset
Computation and Language
Helps computers spot fake news better.
What Signals Really Matter for Misinformation Tasks? Evaluating Fake-News Detection and Virality Prediction under Real-World Constraints
Computation and Language
Spots fake news and predicts how fast it spreads.
A Theoretically Grounded Hybrid Ensemble for Reliable Detection of LLM-Generated Text
Computation and Language
Finds fake writing in schoolwork better.