Lempel-Ziv Complexity, Empirical Entropies, and Chain Rules
By: Neri Merhav
Potential Business Impact:
Makes computer files smaller using smart patterns.
We derive upper and lower bounds on the overall compression ratio of the 1978 Lempel-Ziv (LZ78) algorithm, applied independently to $k$-blocks of a finite individual sequence. Both bounds are given in terms of normalized empirical entropies of the given sequence. For the bounds to be tight and meaningful, the order of the empirical entropy should be small relative to $k$ in the upper bound, but large relative to $k$ in the lower bound. Several non-trivial conclusions arise from these bounds. One of them is a certain form of a chain rule of the Lempel-Ziv (LZ) complexity, which decomposes the joint LZ complexity of two sequences, say, $\mathbf{x}$ and $\mathbf{y}$, into the sum of the LZ complexity of $\mathbf{x}$ and the conditional LZ complexity of $\mathbf{y}$ given $\mathbf{x}$ (up to small terms). The price of this decomposition, however, is in changing the length of the block. Additional conclusions are discussed as well.
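To make the two quantities compared in the abstract concrete, the following is a minimal Python sketch, not the paper's construction. It assumes a string-valued sequence, uses the common $c \log_2 c$ leading term (with $c$ the number of LZ78 phrases) as a crude proxy for the LZ78 code length, and computes the normalized empirical entropy from non-overlapping blocks. All function names (`lz78_phrase_count`, `lz78_code_length`, `blockwise_lz_ratio`, `empirical_entropy`) are hypothetical, and the exact definitions used in the paper may differ.

```python
import math
from collections import Counter


def lz78_phrase_count(s):
    """Count the phrases in the LZ78 incremental parsing of s."""
    seen = set()
    phrase = ""
    count = 0
    for ch in s:
        phrase += ch
        if phrase not in seen:
            # Shortest string not seen before: close the phrase here.
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:
        # Trailing phrase that repeats an earlier one.
        count += 1
    return count


def lz78_code_length(s):
    """Crude code-length proxy in bits: c * log2(c), floored at 1 bit.

    Assumption: only the leading term is kept; a real LZ78 encoder also
    spends bits on the innovation symbol of each phrase.
    """
    c = lz78_phrase_count(s)
    return c * math.log2(c) if c > 1 else 1.0


def blockwise_lz_ratio(s, k):
    """Bits per symbol when LZ78 restarts on every k-block of s."""
    blocks = [s[i:i + k] for i in range(0, len(s), k)]
    total_bits = sum(lz78_code_length(b) for b in blocks)
    return total_bits / len(s)


def empirical_entropy(s, m):
    """Normalized empirical entropy (bits per symbol) of order m.

    Assumption: statistics are taken over non-overlapping m-blocks and
    any incomplete trailing block is dropped.
    """
    blocks = [s[i:i + m] for i in range(0, len(s) - m + 1, m)]
    counts = Counter(blocks)
    total = len(blocks)
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / m


if __name__ == "__main__":
    x = "abababab" * 16
    for k in (8, 32, 128):
        print(k, blockwise_lz_ratio(x, k))
    for m in (1, 2, 4):
        print(m, empirical_entropy(x, m))
```

Varying `k` and `m` on the same sequence gives a rough feel for the regimes mentioned in the abstract (entropy order small relative to $k$ versus large relative to $k$), although this toy proxy ignores the lower-order terms that the bounds track.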