
Joint leave-group-out cross-validation in Bayesian spatial models

Published: April 22, 2025 | arXiv ID: 2504.15586v1

By: Alex Cooper, Aki Vehtari, Catherine Forbes

Potential Business Impact:

Finds better ways to test computer predictions.

Business Areas:
A/B Testing, Data and Analytics

Cross-validation (CV) is a widely used method of predictive assessment based on repeated model fits to different subsets of the available data. CV is applicable in a wide range of statistical settings. However, in cases where data are not exchangeable, the design of CV schemes should account for suspected correlation structures within the data. CV scheme designs include the selection of left-out blocks and the choice of scoring function for evaluating predictive performance. This paper focuses on the impact of two scoring strategies for block-wise CV applied to spatial models with Gaussian covariance structures. We investigate, through several experiments, whether evaluating the predictive performance of blocks of left-out observations jointly, rather than aggregating individual (pointwise) predictions, improves model selection performance. Extending recent findings for data with serial correlation (such as time-series data), our experiments suggest that joint scoring reduces the variability of CV estimates, leading to more reliable model selection, particularly when spatial dependence is strong and model differences are subtle.
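
The difference between the two scoring strategies can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a plug-in Gaussian predictive distribution for one left-out block, whereas the paper works with Bayesian posterior predictives. The function names, the three locations, the exponential covariance, and the observed values are all hypothetical choices made for illustration. The joint score evaluates the block under the full multivariate density. The pointwise score sums the marginal log densities and so ignores the off-diagonal covariance terms.

```python
import numpy as np


def gaussian_logpdf(y, mean, cov):
    """Log density of a (multivariate) normal, computed via a Cholesky factor."""
    y = np.atleast_1d(y).astype(float)
    mean = np.atleast_1d(mean).astype(float)
    cov = np.atleast_2d(cov).astype(float)
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, y - mean)          # whitened residuals
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (y.size * np.log(2.0 * np.pi) + log_det + z @ z)


def joint_score(y_block, mean, cov):
    """Joint log score: one density evaluation for the whole left-out block."""
    return gaussian_logpdf(y_block, mean, cov)


def pointwise_score(y_block, mean, cov):
    """Pointwise log score: sum of marginal log densities (covariances ignored)."""
    variances = np.diag(cov)
    return sum(
        gaussian_logpdf(y_i, m_i, v_i)
        for y_i, m_i, v_i in zip(y_block, mean, variances)
    )


# Hypothetical left-out block: three locations on a line, with an
# exponential covariance (assumed here purely for illustration).
coords = np.array([0.0, 1.0, 2.0])
mean = np.zeros(3)
cov = np.exp(-np.abs(coords[:, None] - coords[None, :]) / 2.0)
y_block = np.array([0.5, 0.4, 0.6])

print("joint score:    ", joint_score(y_block, mean, cov))
print("pointwise score:", pointwise_score(y_block, mean, cov))
```

When the predictive covariance is diagonal, the two scores coincide. They diverge as the dependence within the block grows, which is the regime the abstract identifies as most favourable to joint scoring.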

Country of Origin
🇦🇺 🇫🇮 Australia, Finland

Page Count
46 pages

Category
Statistics: Methodology