Rethinking deep learning: linear regression remains a key benchmark in predicting terrestrial water storage
By: Wanshu Nie, Sujay V. Kumar, Junyu Chen and more
Potential Business Impact:
Simple math predicts water better than complex AI.
Recent advances in machine learning, such as Long Short-Term Memory (LSTM) models and Transformers, have been widely adopted in hydrological applications, where they perform strongly among deep learning models and outperform physical models in various tasks. However, their superiority in predicting land surface states such as terrestrial water storage (TWS), which are shaped by many factors including natural variability and human-driven modifications, remains unclear. Here, using the open-access, globally representative HydroGlobe dataset (comprising a baseline version derived solely from a land surface model simulation and an advanced version incorporating multi-source remote sensing data assimilation), we show that linear regression is a robust benchmark, outperforming the more complex LSTM and Temporal Fusion Transformer for TWS prediction. Our findings highlight the importance of including traditional statistical models as benchmarks when developing and evaluating deep learning models. Additionally, we emphasize the critical need to establish globally representative benchmark datasets that capture the combined impact of natural variability and human interventions.
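As a rough illustration of what a linear regression benchmark can look like, the sketch below fits an ordinary least squares line that predicts the next value of a monthly series from the current one, then scores it on held-out months. It is not the authors' setup: the synthetic series, the lag-1 predictor, the 180/59 train/test split, and the helper names `fit_linear` and `rmse` are all assumptions made for illustration, and no HydroGlobe data is used.

```python
import math
import random


def fit_linear(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Synthetic stand-in series (NOT HydroGlobe data): seasonal cycle plus noise.
random.seed(0)
series = [
    10.0 * math.sin(2 * math.pi * t / 12) + random.gauss(0, 1)
    for t in range(240)
]

# Hypothetical setup: predict next month's value from the current month.
x_all, y_all = series[:-1], series[1:]
split = 180  # first 180 pairs for fitting, the rest held out
x_train, y_train = x_all[:split], y_all[:split]
x_test, y_test = x_all[split:], y_all[split:]

slope, intercept = fit_linear(x_train, y_train)
linear_pred = [slope * xi + intercept for xi in x_test]
baseline_rmse = rmse(linear_pred, y_test)

print(f"linear baseline RMSE on held-out months: {baseline_rmse:.3f}")
```

A deep learning model evaluated on the same held-out months would need to beat `baseline_rmse` to justify its added complexity, which is the kind of comparison the abstract argues for.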
Similar Papers
Identifying Trustworthiness Challenges in Deep Learning Models for Continental-Scale Water Quality Prediction
Machine Learning (CS)
Makes water quality predictions more trustworthy for everyone.
Predicting and Interpolating Spatiotemporal Environmental Data: A Case Study of Groundwater Storage in Bangladesh
Machine Learning (CS)
Maps hidden water underground more accurately.
A Novel Deep Neural Network Architecture for Real-Time Water Demand Forecasting
Machine Learning (CS)
Predicts water use more accurately, with less complexity.