A Test of Lookahead Bias in LLM Forecasts
By: Zhenyu Gao, Wenxi Jiang, Yutong Yan
We develop a statistical test to detect lookahead bias in economic forecasts generated by large language models (LLMs). Using state-of-the-art pre-training data detection techniques, we estimate the likelihood that a given prompt appeared in an LLM's training corpus, a statistic we term Lookahead Propensity (LAP). We formally show that a positive correlation between LAP and forecast accuracy indicates the presence and magnitude of lookahead bias, and we apply the test to two forecasting tasks: news headlines predicting stock returns and earnings call transcripts predicting capital expenditures. Our test provides a cost-efficient diagnostic tool for assessing the validity and reliability of LLM-generated forecasts.
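A minimal sketch of how such a test could be wired up, assuming a Min-K%-style pre-training data detection score as the LAP proxy and a Spearman rank correlation between LAP and per-observation forecast accuracy. The model choice, function names, and the specific membership score are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
import torch
from scipy import stats
from transformers import AutoModelForCausalLM, AutoTokenizer


def min_k_prob_score(text: str, model, tokenizer, k: float = 0.2) -> float:
    """Membership-inference-style LAP proxy: mean log-probability of the
    k% least likely tokens under the model (higher = more likely seen in training)."""
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    input_ids = enc.input_ids
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probability assigned to each realized next token.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = log_probs[torch.arange(input_ids.shape[1] - 1), input_ids[0, 1:]]
    n_lowest = max(1, int(k * token_lp.shape[0]))
    lowest = torch.topk(token_lp, n_lowest, largest=False).values
    return lowest.mean().item()


def lookahead_bias_test(lap_scores: np.ndarray, accuracies: np.ndarray):
    """One-sided test for a positive association between LAP and forecast accuracy;
    a significantly positive rank correlation signals lookahead bias."""
    rho, p_two_sided = stats.spearmanr(lap_scores, accuracies)
    p_one_sided = p_two_sided / 2 if rho > 0 else 1 - p_two_sided / 2
    return rho, p_one_sided


if __name__ == "__main__":
    # Hypothetical usage: score each forecasting prompt, then correlate with
    # per-observation accuracy (e.g., negative absolute forecast error).
    model_name = "gpt2"  # stand-in; the paper's target LLM is an assumption here
    tok = AutoTokenizer.from_pretrained(model_name)
    lm = AutoModelForCausalLM.from_pretrained(model_name).eval()

    prompts = ["Example headline about firm XYZ earnings.",
               "Another earnings call excerpt used as a prompt."]
    accuracies = np.array([0.8, 0.3])  # placeholder accuracy measures

    lap = np.array([min_k_prob_score(p, lm, tok) for p in prompts])
    rho, p_val = lookahead_bias_test(lap, accuracies)
    print(f"Spearman rho = {rho:.3f}, one-sided p-value = {p_val:.3f}")
```

The key design choice in this sketch is separating the LAP estimator from the correlation test, so any pre-training data detection score can be swapped in without changing the inference step.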
Similar Papers
A Fast and Effective Solution to the Problem of Look-ahead Bias in LLMs
Machine Learning (CS)
Proposes a method for mitigating look-ahead bias in LLM-based financial predictions.
What Does ChatGPT Make of Historical Stock Returns? Extrapolation and Miscalibration in LLM Stock Return Forecasts
General Finance
Examines extrapolation and miscalibration in LLM stock return forecasts.
Future Is Unevenly Distributed: Forecasting Ability of LLMs Depends on What We're Asking
Machine Learning (CS)
Shows that LLM forecasting ability depends on what is being asked.