STZ: A High Quality and High Speed Streaming Lossy Compression Framework for Scientific Data
By: Daoce Wang, Pascal Grosset, Jesus Pulido, and more
Potential Business Impact:
Makes big science data smaller, faster, and easier to use.
Error-bounded lossy compression is one of the most efficient solutions for reducing the volume of scientific data. For lossy compression, progressive decompression and random-access decompression are critical features that enable on-demand data access and flexible analysis workflows. However, these features can severely degrade compression quality and speed. To address these limitations, we propose a novel streaming compression framework that supports both progressive decompression and random-access decompression while maintaining high compression quality and speed. Our contributions are three-fold: (1) we design the first compression framework that simultaneously enables both progressive decompression and random-access decompression; (2) we introduce a hierarchical partitioning strategy that enables both streaming features, along with a hierarchical prediction mechanism that mitigates the impact of partitioning and achieves high compression quality, even comparable to that of the state-of-the-art (SOTA) non-streaming compressor SZ3; and (3) our framework delivers high compression and decompression speed, up to 6.7× faster than SZ3.
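To make the two streaming features concrete, below is a minimal sketch of how per-block independent coding can provide random access while a level-ordered block hierarchy can provide progressive refinement. It is not the paper's implementation: a plain uniform scalar quantizer stands in for STZ's hierarchical prediction, and every name in it (encode_blocks, decode_block, hierarchy, BLOCK, LEVELS) is hypothetical.

```python
# A minimal sketch, assuming a 1D array and uniform scalar quantization in
# place of STZ's real predictor/encoder. All names are hypothetical.
import numpy as np

ERROR_BOUND = 1e-3   # absolute error bound for the lossy step
BLOCK = 1024         # elements per independently decodable block
LEVELS = 3           # hierarchy depth; level 0 is the coarsest pass

def encode_blocks(data):
    """Quantize each fixed-size block independently, so any single block
    can later be decoded without touching its neighbors (random access)."""
    blocks = []
    for start in range(0, data.size, BLOCK):
        chunk = data[start:start + BLOCK]
        # Uniform quantization honoring the bound: |x - q*2e| <= e.
        blocks.append(np.round(chunk / (2 * ERROR_BOUND)).astype(np.int64))
    return blocks

def decode_block(q):
    """Reconstruct one block on demand."""
    return q.astype(np.float64) * (2 * ERROR_BOUND)

def hierarchy(num_blocks):
    """Assign each block to the coarsest level whose stride divides its
    index; decoding levels in order densifies the reconstruction, which
    is what makes the stream progressive."""
    levels = [[] for _ in range(LEVELS)]
    for b in range(num_blocks):
        for lv in range(LEVELS):
            if b % (2 ** (LEVELS - 1 - lv)) == 0:
                levels[lv].append(b)
                break
    return levels

data = np.sin(np.linspace(0.0, 100.0, 8 * BLOCK))
blocks = encode_blocks(data)

# Progressive decompression: consume the hierarchy level by level.
available = 0
for lv, ids in enumerate(hierarchy(len(blocks))):
    available += len(ids)
    print(f"after level {lv}: {available}/{len(blocks)} blocks decodable")

# Random-access decompression: decode only block 5 and check the bound.
recon = decode_block(blocks[5])
print("max error:", np.max(np.abs(recon - data[5 * BLOCK:6 * BLOCK])))
```

Because each block is self-contained, decoding a single block or an early hierarchy level never requires neighboring data; per the abstract, STZ's hierarchical prediction is what mitigates the quality loss that such independent partitioning would otherwise cause.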
Similar Papers
Lossy Compression of Scientific Data: Applications, Constraints and Requirements
Instrumentation and Methods for Astrophysics
Shrinks huge science data without losing discoveries.
A High-Throughput GPU Framework for Adaptive Lossless Compression of Floating-Point Data
Databases
Shrinks big computer data without losing any details.
GPZ: GPU-Accelerated Lossy Compressor for Particle Data
Distributed, Parallel, and Cluster Computing
Makes huge science data smaller and faster.