SuryaBench: Benchmark Dataset for Advancing Machine Learning in Heliophysics and Space Weather Prediction
By: Sujit Roy, Dinesha V. Hegde, Johannes Schmude and more
Potential Business Impact:
Helps predict solar storms using AI.
This paper introduces a high-resolution, machine learning-ready heliophysics dataset derived from NASA's Solar Dynamics Observatory (SDO), specifically designed to advance machine learning (ML) applications in solar physics and space weather forecasting. The dataset includes processed imagery from the Atmospheric Imaging Assembly (AIA) and the Helioseismic and Magnetic Imager (HMI), spanning a solar cycle from May 2010 to July 2024. To ensure suitability for ML tasks, the data have been preprocessed; the steps include correction of spacecraft roll angles, orbital adjustments, exposure normalization, and degradation compensation. We also provide auxiliary application benchmark datasets that complement the core SDO dataset. These cover central heliophysics and space weather tasks such as active region segmentation, active region emergence forecasting, coronal field extrapolation, solar flare prediction, solar EUV spectra prediction, and solar wind speed estimation. By establishing a unified, standardized data collection, this dataset aims to facilitate benchmarking, enhance reproducibility, and accelerate the development of AI-driven models for critical space weather prediction tasks, bridging gaps between solar physics, machine learning, and operational forecasting.
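As a rough illustration of two of the preprocessing steps named in the abstract, exposure normalization and degradation compensation, the following minimal Python sketch may help. It is not the authors' pipeline: the function names, the toy 2x2 image, and the scalar degradation factor are hypothetical, and it assumes only that exposure normalization divides raw counts by exposure time and that degradation compensation divides by the fraction of instrument sensitivity remaining.

```python
import numpy as np


def normalize_exposure(image, exposure_s):
    """Convert raw counts (DN) to a count rate (DN/s).

    Hypothetical helper: assumes one exposure time per image.
    """
    if exposure_s <= 0:
        raise ValueError("exposure time must be positive")
    return np.asarray(image, dtype=np.float64) / exposure_s


def compensate_degradation(image, degradation_factor):
    """Rescale an image by the remaining fraction of sensitivity.

    Hypothetical helper: degradation_factor is assumed to lie in (0, 1],
    where 1.0 means no sensitivity loss.
    """
    if not 0.0 < degradation_factor <= 1.0:
        raise ValueError("degradation factor must be in (0, 1]")
    return np.asarray(image, dtype=np.float64) / degradation_factor


# Toy data, purely illustrative (not taken from the dataset).
raw = np.array([[100.0, 200.0], [300.0, 400.0]])
rate = normalize_exposure(raw, exposure_s=2.0)
corrected = compensate_degradation(rate, degradation_factor=0.5)
```

In practice the degradation factor would depend on the channel and observation date, and the roll-angle and orbital corrections mentioned in the abstract would need image metadata that this sketch leaves out.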
Similar Papers
Surya: Foundation Model for Heliophysics
Solar and Stellar Astrophysics
Predicts solar flares and space weather events.
Contrastive Heliophysical Image Pretraining for Solar Dynamics Observatory Records
CV and Pattern Recognition
Makes solar pictures easier for computers to understand.
OceanForecastBench: A Benchmark Dataset for Data-Driven Global Ocean Forecasting
Machine Learning (CS)
Helps predict ocean changes better and faster.