COVID-19 Forecasting from U.S. Wastewater Surveillance Data: A Retrospective Multi-Model Study (2022-2024)

Published: November 30, 2025 | arXiv ID: 2512.01074v1

By: Faharudeen Alhassan, Hamed Karami, Amanda Bleichrodt and more

Potential Business Impact:

Predicts disease outbreaks using sewer water.

Business Areas:
Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

Accurate and reliable forecasting models are critical for guiding public health responses and policy decisions during pandemics such as COVID-19. Retrospective evaluation of model performance is essential for improving epidemic forecasting capabilities. In this study, we used COVID-19 wastewater data from CDC's National Wastewater Surveillance System to generate sequential weekly retrospective forecasts for the United States from March 2022 through September 2024, both at the national level and for four major regions (Northeast, Midwest, South, and West). We produced 133 weekly forecasts using 11 models, including ARIMA, generalized additive models (GAM), simple linear regression (SLR), Prophet, and the n-sub-epidemic framework (top-ranked, weighted-ensemble, and unweighted-ensemble variants). Forecast performance was assessed using mean absolute error (MAE), mean squared error (MSE), weighted interval score (WIS), and 95% prediction interval coverage. The n-sub-epidemic unweighted ensembles outperformed all other models at 3-4-week horizons, particularly at the national level and in the Midwest and West. ARIMA and GAM performed best at 1-2-week horizons in most regions, whereas Prophet and SLR consistently underperformed across regions and horizons. These findings highlight the value of region-specific modeling strategies and demonstrate the utility of the n-sub-epidemic framework for real-time outbreak forecasting using wastewater surveillance data.
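
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary layout for prediction intervals, and the choice of interval levels are assumptions made for illustration, and the WIS formula follows the commonly used definition (half the absolute error of the median plus alpha/2-weighted interval scores, divided by K + 1/2 for K intervals).

```python
def mae(preds, obs):
    """Mean absolute error of point forecasts."""
    return sum(abs(p - o) for p, o in zip(preds, obs)) / len(obs)


def mse(preds, obs):
    """Mean squared error of point forecasts."""
    return sum((p - o) ** 2 for p, o in zip(preds, obs)) / len(obs)


def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval.

    Sums the interval width and a penalty of 2/alpha times the distance
    by which the observation y falls outside [lower, upper].
    """
    width = upper - lower
    under = (2.0 / alpha) * max(lower - y, 0.0)
    over = (2.0 / alpha) * max(y - upper, 0.0)
    return width + under + over


def weighted_interval_score(median, intervals, y):
    """Weighted interval score (WIS) for one forecast.

    `intervals` maps alpha -> (lower, upper); this layout and the set of
    alpha levels are assumptions, not taken from the paper.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


def coverage(lowers, uppers, obs):
    """Fraction of observations inside their prediction intervals."""
    hits = sum(1 for lo, hi, y in zip(lowers, uppers, obs) if lo <= y <= hi)
    return hits / len(obs)
```

Lower MAE, MSE, and WIS indicate better forecasts, while 95% prediction interval coverage is judged by how close it is to 0.95. In a study like this one, each metric would be averaged over all forecast dates separately for each model, region, and horizon.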

Country of Origin
🇺🇸 United States

Page Count
38 pages

Category
Statistics: Applications