Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts

Published: July 25, 2025 | arXiv ID: 2507.19477v1

By: Sang-Woo Lee, Sohee Yang, Donghyun Kwak and more

BigTech Affiliations: Google

Potential Business Impact:

AI forecasts future events with accuracy approaching that of expert human forecasters.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performance, and reinforcement learning has also been reported to improve future forecasting. Additionally, the unprecedented success of recent reasoning models and Deep Research-style models suggests that technology capable of greatly improving forecasting performance has been developed. Therefore, based on these positive recent trends, we argue that the time is ripe for research on large-scale training of superforecaster-level event forecasting LLMs. We discuss two key research directions: training methods and data acquisition. For training, we first introduce three difficulties of LLM-based event forecasting training: noisiness-sparsity, knowledge cut-off, and simple reward structure problems. Then, we present related ideas to mitigate these problems: hypothetical event Bayesian networks, utilizing poorly-recalled and counterfactual events, and auxiliary reward signals. For data, we propose aggressive use of market, public, and crawling datasets to enable large-scale training and evaluation. Finally, we explain how these technical advances could enable AI to provide predictive intelligence to society in broader areas. This position paper presents promising specific paths and considerations for getting closer to superforecaster-level AI technology, aiming to call for researchers' interest in these directions.

Future Is Unevenly Distributed: Forecasting Ability of LLMs Depends on What We're Asking

Machine Learning (CS)

Models guess future events better with more facts.

23 Nov 2025

91%

Leveraging Log Probabilities in Language Models to Forecast Future Events

Computation and Language

AI predicts future events with better accuracy.

8 Jan 2025

91%

Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction

Computation and Language

Helps computers guess what might happen next.

10 Jan 2025

Country of Origin

🇺🇸 United States

Repos / Data Links

github.com

Page Count

29 pages
