
What Signals Really Matter for Misinformation Tasks? Evaluating Fake-News Detection and Virality Prediction under Real-World Constraints

Published: December 2, 2025 | arXiv ID: 2512.02552v1

By: Francesco Paolo Savatteri, Chahan Vidal-Gorène, Florian Cafiero

Potential Business Impact:

Spots fake news and predicts how fast it spreads.

Business Areas:

Text Analytics Data and Analytics, Software

We present an evaluation-driven study of two practical tasks in online misinformation: (i) fake-news detection and (ii) virality prediction, both in operational settings that demand rapid reaction. Using the EVONS and FakeNewsNet datasets, we compare textual embeddings (RoBERTa, with a control using Mistral) against lightweight numeric features (timing, follower counts, verification, likes) and sequence models (GRU, gating architectures, Transformer encoders). We show that textual content alone is a strong discriminator for fake-news detection, while numeric-only pipelines remain viable when language models are unavailable or compute is constrained. Virality prediction is markedly harder than fake-news detection and is highly sensitive to label construction; in our setup, a median-based "viral" split (<50 likes) is pragmatic but underestimates real-world virality, and time-censoring for engagement features is desirable yet difficult under current API limits. Dimensionality-reduction analyses suggest that non-linear structure is more informative for virality than for fake-news detection (t-SNE > PCA on numeric features). Swapping RoBERTa for Mistral embeddings yields only modest deltas, leaving conclusions unchanged. We discuss implications for evaluation design and report reproducibility constraints that realistically affect the field. We release splits and code where possible and provide guidance for metric selection.
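To make the label-construction point concrete, here is a minimal sketch of a median-based "viral" split. It is an illustration under stated assumptions, not the authors' released code: the function name `median_split` and the like counts are hypothetical, and the abstract does not say whether items exactly at the median count as viral (here they do not).

```python
from statistics import median


def median_split(like_counts):
    """Label an item viral (1) if its like count exceeds the sample median, else 0.

    Hypothetical helper: ties at the median are labeled non-viral by assumption.
    """
    threshold = median(like_counts)
    labels = [1 if likes > threshold else 0 for likes in like_counts]
    return threshold, labels


# Made-up like counts, heavily right-skewed as engagement data tends to be.
likes = [0, 1, 2, 3, 5, 8, 40, 120, 900, 15000]
threshold, labels = median_split(likes)
print(threshold, labels)
```

With skewed counts like these, the median sits very low, so an item with only 8 likes lands in the "viral" class alongside one with 15,000. That is one way to read the abstract's remark that a median split is pragmatic but underestimates real-world virality.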

Early Multimodal Prediction of Cross-Lingual Meme Virality on Reddit: A Time-Window Analysis

Artificial Intelligence

Predicts if memes will go viral very fast.

7 Oct 2025 0

88%

Is Less Really More? Fake News Detection with Limited Information

Information Retrieval

Finds fake news using less text.

2 Apr 2025 2

87%

Simulating Misinformation Propagation in Social Networks using Large Language Models

Social and Information Networks

Finds how fake news spreads and how to stop it.

13 Nov 2025 1


Page Count

7 pages
