VeriTaS: The First Dynamic Benchmark for Multimodal Automated Fact-Checking
By: Mark Rothermel, Marcus Kornmann, Marcus Rohrbach and more
The growing scale of online misinformation urgently demands Automated Fact-Checking (AFC). Existing benchmarks for evaluating AFC systems, however, are largely limited in terms of task scope, modalities, domain, language diversity, realism, or coverage of misinformation types. Critically, they are static and thus subject to data leakage as their claims enter the pretraining corpora of LLMs. As a result, benchmark performance no longer reliably reflects the actual ability to verify claims. We introduce Verified Theses and Statements (VeriTaS), the first dynamic benchmark for multimodal AFC, designed to remain robust under ongoing large-scale pretraining of foundation models. VeriTaS currently comprises 24,000 real-world claims from 108 professional fact-checking organizations across 54 languages, covering textual and audiovisual content. Claims are added quarterly via a fully automated seven-stage pipeline that normalizes claim formulation, retrieves original media, and maps heterogeneous expert verdicts to a novel, standardized, and disentangled scoring scheme with textual justifications. Through human evaluation, we demonstrate that the automated annotations closely match human judgments. We commit to updating VeriTaS in the future, establishing a leakage-resistant benchmark that supports meaningful AFC evaluation in the era of rapidly evolving foundation models. We will make the code and data publicly available.
Similar Papers
Fact-Checking at Scale: Multimodal AI for Authenticity and Context Verification in Online Media
Multimedia
Checks if online videos and pictures are real.
Semi-automated Fact-checking in Portuguese: Corpora Enrichment using Retrieval with Claim extraction
Computation and Language
Helps stop fake news by finding proof.