Score: 1

When Facts Change: Probing LLMs on Evolving Knowledge with evolveQA

Published: October 22, 2025 | arXiv ID: 2510.19172v1

By: Nishanth Sridhar Nakshatri, Shamik Roy, Manoj Ghuhan Arivazhagan, and more

BigTech Affiliations: Amazon

Potential Business Impact:

Evaluates how well language models keep up with facts that change over time.

Business Areas:
Q&A Community and Lifestyle

LLMs often fail to handle temporal knowledge conflicts: contradictions that arise when facts evolve over time within their training data. Existing studies evaluate this phenomenon through benchmarks built on structured knowledge bases such as Wikidata, but these focus on widely covered, easily memorized popular entities and lack the dynamic structure needed to fairly evaluate LLMs with different knowledge cut-off dates. We introduce evolveQA, a benchmark specifically designed to evaluate LLMs on temporally evolving knowledge, constructed from 3 real-world, time-stamped corpora: AWS updates, Azure changes, and WHO disease outbreak reports. Our framework identifies naturally occurring knowledge evolution and generates questions with gold answers tailored to different LLM knowledge cut-off dates. Through extensive evaluation of 12 open- and closed-source LLMs across 3 knowledge probing formats, we demonstrate significant performance drops of up to 31% on evolveQA compared to static knowledge questions.
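The central design point in the abstract is that the correct ("gold") answer depends on each model's knowledge cut-off: a model trained before a fact changed should be graded against the earlier value. The sketch below illustrates that idea only; the `FactVersion` fields, `gold_answer` helper, and example values are assumptions for illustration, not the paper's actual data model or pipeline.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative sketch: pick the fact value a model *should* know given its
# knowledge cut-off, when a fact has evolved across time-stamped corpus entries.
# Field names and example data are hypothetical, not from evolveQA.

@dataclass
class FactVersion:
    valid_from: date   # timestamp of the corpus entry announcing this value
    value: str         # the fact's value as of that entry

def gold_answer(versions: list[FactVersion], cutoff: date) -> str | None:
    """Return the latest fact value announced on or before the model's cut-off."""
    known = [v for v in versions if v.valid_from <= cutoff]
    if not known:
        return None  # the fact did not yet exist before this model's cut-off
    return max(known, key=lambda v: v.valid_from).value

# Hypothetical example: a service quota that changed over time.
versions = [
    FactVersion(date(2022, 3, 1), "50 instances per region"),
    FactVersion(date(2023, 9, 15), "100 instances per region"),
    FactVersion(date(2024, 6, 1), "200 instances per region"),
]

print(gold_answer(versions, cutoff=date(2023, 1, 1)))    # -> 50 instances per region
print(gold_answer(versions, cutoff=date(2024, 12, 31)))  # -> 200 instances per region
```

Selecting gold answers this way lets the same time-stamped corpus yield different, equally fair answer keys for models with different training cut-offs, which is the dynamic structure the abstract says static Wikidata-style benchmarks lack.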

Country of Origin
🇺🇸 United States

Page Count
23 pages

Category
Computer Science:
Computation and Language