The cost of ensembling: is it always worth combining?
By: Marco Zanotti
Potential Business Impact:
Makes computer predictions faster and cheaper.
Given the continuous increase in dataset sizes and the complexity of forecasting models, the trade-off between forecast accuracy and computational cost is emerging as an extremely relevant topic, especially in the context of ensemble learning for time series forecasting. To assess it, we evaluated ten base models and eight ensemble configurations across two large-scale retail datasets (M5 and VN1), considering both point and probabilistic accuracy under varying retraining frequencies. We showed that ensembles consistently improve forecasting performance, particularly in probabilistic settings. However, these gains come at a substantial computational cost, especially for larger, accuracy-driven ensembles. We found that reducing retraining frequency significantly lowers costs with minimal impact on accuracy, particularly for point forecasts. Moreover, efficiency-driven ensembles offer a strong balance, achieving competitive accuracy at considerably lower cost than accuracy-optimized combinations. Most importantly, small ensembles of two or three models are often sufficient to achieve near-optimal results. These findings provide practical guidelines for deploying scalable and cost-efficient forecasting systems, supporting the broader goals of sustainable AI in forecasting. Overall, this work shows that careful ensemble design and retraining strategy selection can yield accurate, robust, and cost-effective forecasts suitable for real-world applications.
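To make the combination idea concrete, here is a minimal sketch of the kind of equal-weight ensemble the abstract alludes to: averaging the forecasts of a small set of base models over a common horizon. The model names and numbers below are purely illustrative placeholders, not the paper's actual base models or results.

```python
# Hypothetical sketch: equal-weight combination of a small forecast ensemble.
# The paper finds that two or three base models are often near-optimal;
# this shows the simplest possible combination rule (the mean).

def ensemble_mean(forecasts):
    """Average a list of per-model forecast sequences (all same horizon)."""
    horizon = len(forecasts[0])
    n_models = len(forecasts)
    return [sum(f[h] for f in forecasts) / n_models for h in range(horizon)]

# Toy point forecasts from three hypothetical base models, 4-step horizon.
model_a = [100.0, 102.0, 101.0, 103.0]
model_b = [ 98.0, 100.0, 102.0, 104.0]
model_c = [102.0, 104.0, 100.0, 102.0]

combined = ensemble_mean([model_a, model_b, model_c])
print(combined)  # [100.0, 102.0, 101.0, 103.0]
```

The same averaging applies per-quantile in the probabilistic setting; in practice, the cost lever the paper studies is how often each base model in such a combination is retrained, not the combination step itself, which is cheap.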
Similar Papers
Multi-layer Stack Ensembles for Time Series Forecasting
Machine Learning (CS)
Makes computer predictions of future events more accurate.
On the retraining frequency of global forecasting models
Applications
Saves computer power by retraining less often.
Using ensemble methods of machine learning to predict real estate prices
Machine Learning (CS)
Predicts house prices more accurately.