Integrating Score-Based Diffusion Models with Machine Learning-Enhanced Localization for Advanced Data Assimilation in Geological Carbon Storage
By: Gabriel Serrão Seabra, Nikolaj T. Mücke, Vinicius Luiz Santos Silva and more
Potential Business Impact:
Helps store carbon safely underground.
Accurate characterization of subsurface heterogeneity is important for the safe and effective implementation of geological carbon storage (GCS) projects. This paper explores how machine learning methods can enhance data assimilation for GCS, using a framework that integrates score-based diffusion models with machine learning-enhanced localization in channelized reservoirs during CO$_2$ injection. The localization framework uses large ensembles ($N_s = 5000$), with permeabilities generated by the diffusion model and states computed by simple ML algorithms, to improve covariance estimation for the Ensemble Smoother with Multiple Data Assimilation (ESMDA). We apply the ML algorithms to a prior ensemble of channelized permeability fields generated with the geostatistical model FLUVSIM. The approach is applied to a CO$_2$ injection scenario simulated with the Delft Advanced Research Terra Simulator (DARTS). Our ML-based localization maintains significantly more ensemble variance than assimilation without localization, while achieving comparable data-matching quality. This framework has practical implications for GCS projects, helping improve the reliability of uncertainty quantification for risk assessment.
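To make the role of localization in ESMDA more concrete, here is a minimal NumPy sketch of a single ESMDA update step with an optional localization matrix. It is an illustration under stated assumptions, not the authors' implementation: the function name `esmda_step`, the linear forward model `G`, the toy dimensions, and the choice to apply the localization matrix `rho` element-wise to the Kalman gain are all hypothetical. In the paper, the forward model is the DARTS simulator and the localization coefficients come from the ML-enhanced large-ensemble procedure, neither of which is reproduced here.

```python
import numpy as np

def esmda_step(M, D, d_obs, C_e, alpha, rho=None, rng=None):
    """One ESMDA update (illustrative sketch).

    M     : (n_m, n_e) ensemble of model parameters (e.g. log-permeability)
    D     : (n_d, n_e) simulated data for each ensemble member
    d_obs : (n_d,)     observed data
    C_e   : (n_d, n_d) observation-error covariance
    alpha : inflation factor for this step (the 1/alpha values sum to 1 over all steps)
    rho   : (n_m, n_d) localization coefficients in [0, 1], or None for no localization
    """
    rng = np.random.default_rng() if rng is None else rng
    n_e = M.shape[1]
    dM = M - M.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    C_md = dM @ dD.T / (n_e - 1)  # parameter-data cross-covariance
    C_dd = dD @ dD.T / (n_e - 1)  # data auto-covariance
    # Kalman gain K = C_md (C_dd + alpha C_e)^{-1}; the matrix is symmetric,
    # so we solve the transposed system instead of forming an explicit inverse.
    K = np.linalg.solve(C_dd + alpha * C_e, C_md.T).T
    if rho is not None:
        K = rho * K  # assumption: Schur (element-wise) product applied to the gain
    # Perturb the observations with inflated noise, one draw per member.
    noise = rng.multivariate_normal(np.zeros(d_obs.size), alpha * C_e, size=n_e).T
    return M + K @ (d_obs[:, None] + noise - D)

# Toy setup with a hypothetical linear forward model G (stand-in for a simulator).
rng = np.random.default_rng(0)
n_m, n_d, n_e = 20, 3, 50
G = rng.normal(size=(n_d, n_m))
M_prior = rng.normal(size=(n_m, n_e))
d_obs = G @ rng.normal(size=n_m)
C_e = 0.01 * np.eye(n_d)

# With rho = 0 everywhere the gain vanishes and the ensemble is left unchanged.
M_noupdate = esmda_step(M_prior, G @ M_prior, d_obs, C_e, alpha=4.0,
                        rho=np.zeros((n_m, n_d)), rng=rng)
# Without localization, every parameter is updated by every observation.
M_post = esmda_step(M_prior, G @ M_prior, d_obs, C_e, alpha=4.0, rng=rng)

# Ratio of mean posterior to mean prior ensemble variance: a simple diagnostic
# for the kind of variance loss that localization is meant to limit.
variance_ratio = M_post.var(axis=1).mean() / M_prior.var(axis=1).mean()
```

A full ESMDA run would repeat this step, re-simulating the data after each update, with inflation factors whose reciprocals sum to one (for example, four steps with `alpha = 4.0` each). Entries of `rho` near zero suppress updates driven by spurious sample correlations, which is the mechanism by which localization helps retain ensemble variance.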
Similar Papers
Generative Latent Diffusion Model for Inverse Modeling and Uncertainty Analysis in Geological Carbon Sequestration
Geophysics
Helps store carbon underground more safely.
Mitigating loss of variance in ensemble data assimilation: machine learning-based and distance-free localization
Machine Learning (CS)
Improves computer weather forecasts by reducing errors.
Hybrid machine learning data assimilation for marine biogeochemistry
Atmospheric and Oceanic Physics
Helps predict ocean changes by learning from data.