From regular expressions to deterministic finite automata: $2^{\frac{n}{2}+\sqrt{n}(\log n)^{Θ(1)}}$ states are necessary and sufficient
By: Olga Martynova, Alexander Okhotin
Potential Business Impact:
Makes computer programs run much faster.
It is proved that every regular expression of alphabetic width $n$, that is, with $n$ occurrences of symbols of the alphabet, can be transformed into a deterministic finite automaton (DFA) with $2^{\frac{n}{2}+(\frac{\log_2 e}{2\sqrt{2}}+o(1))\sqrt{n\ln n}}$ states recognizing the same language (the best upper bound up to date is $2^n$). At the same time, it is also shown that this bound is close to optimal, namely, that there exist regular expressions of alphabetic width $n$ over a two-symbol alphabet, such that every DFA for the same language has at least $2^{\frac{n}{2}+(\sqrt{2} + o(1))\sqrt{\frac{n}{\ln n}}}$ states (the previously known lower bound is $\frac{5}{4}2^{\frac{n}{2}}$). The same bounds are obtained for an intermediate problem of determinizing nondetermistic finite automata (NFA) with each state having all incoming transitions by the same symbol.
Similar Papers
Polynomial Complementation of Nondeterministic 2-Way Finite Automata by 1-Limited Automata
Formal Languages and Automata Theory
Makes computers understand all languages perfectly.
Nondeterminism makes unary 1-limited automata concise
Formal Languages and Automata Theory
Makes computers understand patterns much faster.
Deterministic Suffix-reading Automata
Formal Languages and Automata Theory
Lets computers read words faster by skipping letters.