A Markov-Chain Characterization of Finite-State Dimension and a Generalization of Agafonov's Theorem
By: Laurent Bienvenu, Hugo Gimbert, Subin Pulari
Potential Business Impact:
Gives a precise way to measure how random a digit sequence looks to simple, memory-limited machines (finite automata).
Finite-state dimension quantifies the asymptotic rate of information in an infinite sequence as perceived by finite automata. For a fixed alphabet, the infinite sequences that have maximal finite-state dimension are exactly those that are Borel normal, i.e., in which all words of any given length appear with the same frequency. A theorem of Schnorr and Stimm (1972) shows that a real number is Borel normal if and only if, for every finite-state irreducible Markov chain with fair transitions, when the chain is simulated using the binary expansion of the given number, the empirical distribution of states converges to its stationary distribution. In this paper we extend this correspondence beyond normal numbers. We show that the finite-state dimension of a sequence can be characterized in terms of the conditional Kullback-Leibler divergence between the limiting distributions arising from the simulation of Markov chains using the given sequence and their stationary distributions. This provides a new information-theoretic characterization of finite-state dimension which generalizes the Schnorr-Stimm result. As an application, we prove a generalization of Agafonov's theorem for normal numbers. Agafonov's theorem states that a sequence is normal if and only if every subsequence selected by a finite automaton is also normal. We extend this to arbitrary sequences by establishing a tight quantitative relationship between the finite-state dimension of a sequence and the finite-state dimensions of its automatic subsequences.
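The Schnorr–Stimm correspondence described in the abstract can be sketched concretely. Below is a minimal, stdlib-only Python illustration (not code from the paper): a two-state automaton reading the base-2 Champernowne sequence (a standard example of a Borel normal number) induces an irreducible Markov chain with fair transitions, and the empirical distribution of states approaches the chain's stationary distribution. The plain KL divergence between the two distributions, a simplification of the paper's conditional KL divergence, is correspondingly small. The particular automaton and the number of bits used are illustrative choices.

```python
import math
from collections import Counter

def champernowne_bits(n_bits):
    """First n_bits of the base-2 Champernowne sequence: 1 10 11 100 101 ..."""
    bits = []
    k = 1
    while len(bits) < n_bits:
        bits.extend(int(b) for b in bin(k)[2:])
        k += 1
    return bits[:n_bits]

# Deterministic transition function of a 2-state automaton. Feeding it
# uniformly random bits induces a Markov chain with "fair" transitions
# (each of the two outgoing edges of a state has probability 1/2).
delta = {(0, 0): 0, (0, 1): 1,   # state 0: stay on bit 0, move to 1 on bit 1
         (1, 0): 0, (1, 1): 0}   # state 1: return to 0 on either bit

# Stationary distribution of the induced chain P = [[1/2, 1/2], [1, 0]]:
# solving pi = pi * P with pi summing to 1 gives pi = (2/3, 1/3).
pi = (2 / 3, 1 / 3)

# Simulate the chain using the bits of a normal number as the input stream.
n = 1_000_000
state, counts = 0, Counter()
for b in champernowne_bits(n):
    state = delta[(state, b)]
    counts[state] += 1

empirical = tuple(counts[s] / n for s in (0, 1))

# KL divergence (in bits) between empirical and stationary distributions;
# for a normal input sequence it tends to 0 as n grows.
kl = sum(p * math.log2(p / q) for p, q in zip(empirical, pi))
print(f"empirical={empirical}, stationary={pi}, KL={kl:.5f}")
```

Champernowne's sequence converges to normality slowly (its discrepancy decays roughly like 1/log n), so at a million bits the empirical distribution is close to, but visibly not exactly, (2/3, 1/3).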