Lossy Compression of Scientific Data: Applications Constrains and Requirements
By: Franck Cappello, Allison Baker, Ebru Bozdağ and more
Potential Business Impact:
Shrinks huge science data without losing discoveries.
Data volumes from scientific simulations and instruments (supercomputers, accelerators, telescopes) are growing and often exceed network, storage, and analysis capabilities. The scientific community's response to this challenge is scientific data reduction. Reduction can take many forms, such as triggering, sampling, filtering, quantization, and dimensionality reduction. This report focuses on one specific technique: lossy compression. Lossy compression retains all data points and reduces their size by exploiting correlations in the data and accepting a controlled loss of accuracy. Quality constraints, especially on quantities of interest, are crucial for preserving scientific discoveries. User requirements also include compression ratio and speed. Although many papers have been published on lossy compression techniques and the community shares reference datasets, detailed specifications of application needs that could guide lossy compression researchers and developers are lacking. This report fills that gap by describing the requirements and constraints of nine scientific applications covering a broad spectrum of domains (climate, combustion, cosmology, fusion, light sources, molecular dynamics, quantum circuit simulation, seismology, and system logs). The report also details key lossy compression technologies (SZ, ZFP, MGARD, LC, SPERR, DCTZ, TEZip, LibPressio), discussing their history, principles, error control, hardware support, features, and impact. By presenting both application needs and compression technologies, the report aims to inspire new research that addresses the remaining gaps.
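To make the idea of keeping every data point while bounding the loss of accuracy concrete, here is a minimal Python sketch of an error-bounded scheme: each value is predicted from the previously reconstructed value, the residual is quantized in steps of twice the error bound, and the integer codes are passed to a lossless coder. This is an illustration under stated assumptions, not the implementation of SZ, ZFP, or any other compressor named in the abstract. The function names `compress` and `decompress`, the `abs_error` parameter, and the use of `zlib` as the lossless stage are all choices made for this example.

```python
import math
import struct
import zlib


def compress(values, abs_error):
    """Illustrative error-bounded compressor (hypothetical, not a real library API).

    Each value is predicted by the previous *reconstructed* value; the
    prediction residual is quantized with step 2 * abs_error, so the
    pointwise reconstruction error stays within abs_error (up to
    floating-point rounding).
    """
    step = 2.0 * abs_error
    codes = []
    prev = 0.0  # the decompressor starts from the same initial prediction
    for v in values:
        q = round((v - prev) / step)  # integer quantization code
        codes.append(q)
        prev = prev + q * step        # track what the decompressor will see
    # Assumption: codes fit in signed 32-bit integers for this demo.
    payload = struct.pack(f"<{len(codes)}i", *codes)
    return zlib.compress(payload)     # lossless stage removes redundancy


def decompress(blob, abs_error):
    """Invert compress(): rebuild values from the quantization codes."""
    step = 2.0 * abs_error
    payload = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(payload) // 4}i", payload)
    out = []
    prev = 0.0
    for q in codes:
        prev = prev + q * step
        out.append(prev)
    return out


# Usage: a smooth synthetic signal stands in for correlated scientific data.
values = [math.sin(0.01 * i) for i in range(10000)]
abs_error = 1e-3
blob = compress(values, abs_error)
restored = decompress(blob, abs_error)

max_error = max(abs(a - b) for a, b in zip(values, restored))
raw_bytes = 8 * len(values)  # size if stored as 64-bit floats
print(f"points kept: {len(restored)} of {len(values)}")
print(f"max pointwise error: {max_error:.2e} (bound {abs_error:.0e})")
print(f"compression ratio: {raw_bytes / len(blob):.1f}")
```

The sketch shows the three user requirements from the abstract in miniature: the error bound is the quality constraint, the ratio of raw to compressed bytes is the compression ratio, and speed would be measured by timing the two functions. A constraint on a derived quantity of interest is harder to guarantee than a pointwise bound and is not covered here.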
Similar Papers
Challenges and Solutions in Selecting Optimal Lossless Data Compression Algorithms
Information Theory
Finds best way to shrink files without losing info.
AstroCompress: A benchmark dataset for multi-purpose compression of astronomical data
Artificial Intelligence
Makes telescopes send more pictures using less space.
Lossless Compression of Time Series Data: A Comparative Study
Information Theory
Makes storing and sending data much smaller.