Fixing the Pitfalls of Probabilistic Time-Series Forecasting Evaluation by Kernel Quadrature
By: Masaki Adachi, Masahiro Fujisawa, Michael A. Osborne
Potential Business Impact:
Fixes how we check if predictions are good.
Despite the significance of probabilistic time-series forecasting models, their evaluation metrics often involve intractable integrals. The most widely used metric, the continuous ranked probability score (CRPS), is a strictly proper scoring rule; however, computing it requires approximation. We found that two popular CRPS estimators, the quantile-based estimator implemented in the widely used GluonTS library and the probability-weighted moment approximation, both exhibit inherent estimation biases. These biases yield crude approximations that can produce incorrect rankings of forecasting models when their CRPS values are close. To address this, we introduce a kernel quadrature approach that leverages an unbiased CRPS estimator and employs cubature construction for scalable computation. Empirically, our approach consistently outperforms the two widely used CRPS estimators.
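To illustrate the bias the abstract refers to, here is a minimal sketch (not the paper's kernel quadrature method) contrasting two standard sample-based CRPS estimators built from the energy form CRPS(F, y) = E|X - y| - 0.5 E|X - X'|: the common "NRG" estimator, which divides the pairwise-spread term by n^2 and is biased upward, and the "fair" estimator, which divides by n(n-1) and is unbiased. Function names are illustrative, not from the paper or GluonTS.

```python
import numpy as np

def crps_nrg(samples, y):
    """Biased 'NRG' CRPS estimate: spread term averaged over all n^2 pairs,
    including each sample paired with itself."""
    n = len(samples)
    term1 = np.mean(np.abs(samples - y))
    spread = np.abs(samples[:, None] - samples[None, :]).sum()
    return term1 - 0.5 * spread / n**2

def crps_fair(samples, y):
    """Unbiased 'fair' CRPS estimate: spread term averaged over the
    n(n-1) distinct ordered pairs, removing the self-pairing bias."""
    n = len(samples)
    term1 = np.mean(np.abs(samples - y))
    spread = np.abs(samples[:, None] - samples[None, :]).sum()
    return term1 - 0.5 * spread / (n * (n - 1))

# The biased estimator shrinks the spread term, so it systematically
# reports a larger (worse-looking) CRPS than the unbiased one.
samples = np.array([0.0, 1.0, 2.0, 3.0])
print(crps_nrg(samples, 1.5), crps_fair(samples, 1.5))
```

With few samples the gap between the two estimates is large relative to typical CRPS differences between competing models, which is how a biased estimator can flip model rankings when scores are close.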
Similar Papers
Trajectory learning for ensemble forecasts via the continuous ranked probability score: a Lorenz '96 case study
Numerical Analysis
Improves weather forecasts by learning from past predictions.
Improving Statistical Postprocessing for Extreme Wind Speeds using Tuned Weighted Scoring Rules
Applications
Improves wind storm predictions without hurting normal forecasts.
Probabilistic measures afford fair comparisons of AIWP and NWP model output
Applications
Compares weather forecasts to find the best one.