Lossless Compression of Time Series Data: A Comparative Study
By: Jonas G. Matt, Pengcheng Huang, Balz Maag
Potential Business Impact:
Makes data much smaller to store and send.
Our increasingly digital and connected world generates unprecedented amounts of data. This data must be managed, transmitted, and stored efficiently to conserve resources and allow scalability. Data compression has long been a key technology for this purpose, resulting in a vast landscape of available techniques. This study, the largest to date, analyzes and compares lossless compression methods for time series data. We present a unified framework encompassing two stages: data transformation and entropy encoding. We evaluate compression algorithms on both synthetic and real-world datasets with varying characteristics. Through ablation studies at each compression stage, we isolate the impact of individual components on overall compression performance, revealing the strengths and weaknesses of different algorithms when facing diverse time series properties. Our study underscores the importance of well-configured and complete compression pipelines beyond individual components or algorithms, and it offers a comprehensive guide for selecting and composing the compression algorithms best suited to a specific dataset.
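The two-stage idea in the abstract (a data transformation followed by an entropy-style encoder) can be illustrated with a minimal sketch. Everything here is an assumption chosen for illustration, not the authors' method: delta encoding stands in for the transformation stage, Python's `zlib` (DEFLATE, which combines LZ77 matching with Huffman coding) stands in for the encoding stage, and the sample series and helper names are hypothetical.

```python
import struct
import zlib


def delta_encode(values):
    """Transformation stage (assumed): first value, then successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def pack(ints):
    """Serialize integers as little-endian signed 64-bit values."""
    return struct.pack(f"<{len(ints)}q", *ints)


def unpack(data):
    """Inverse of pack."""
    return list(struct.unpack(f"<{len(data) // 8}q", data))


def compress(values):
    """Full pipeline: transform, serialize, then encode with zlib."""
    return zlib.compress(pack(delta_encode(values)), 9)


def decompress(blob):
    """Undo the pipeline stage by stage, in reverse order."""
    return delta_decode(unpack(zlib.decompress(blob)))


# Hypothetical integer time series with a steady trend.
series = [1000 + 3 * i for i in range(1000)]

encoder_only = zlib.compress(pack(series), 9)  # ablation: no transformation stage
pipeline = compress(series)                    # transformation + encoder

assert decompress(pipeline) == series          # lossless round trip
print(len(pack(series)), len(encoder_only), len(pipeline))
```

Comparing `encoder_only` with `pipeline` is a toy version of an ablation: the encoder is the same in both, so any size difference comes from the transformation stage. On a trending series like this one the deltas are nearly constant, which is why the full pipeline should produce the smaller output; on data with different properties, the result may differ.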
Similar Papers
Lossless Compression: A New Benchmark for Time Series Model Evaluation
Machine Learning (CS)
Tests computer models by how well they shrink data.
Data Compression for Time Series Modelling: A Case Study of Smart Grid Demand Forecasting
Computational Engineering, Finance, and Science
Shrinks energy data without losing prediction power.
Challenges and Solutions in Selecting Optimal Lossless Data Compression Algorithms
Information Theory
Finds best way to shrink files without losing info.