Testing Suffixient Sets
By: Davide Cenzato, Francisco Olivares, Nicola Prezza
Potential Business Impact:
Finds text patterns faster by storing less.
Suffixient sets are a novel prefix array (PA) compression technique based on subsampling PA rather than compressing the entire array, as previous techniques did. By storing only a few entries of PA (in fact, a compressed number of entries), pattern matching via binary search provably remains possible, provided that random access is available on the text. In this paper, we tackle the problems of determining whether a given subset of text positions is (1) a suffixient set or (2) a suffixient set of minimum cardinality. We provide linear-time algorithms solving these problems.
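To make the two testing problems concrete, here is a minimal brute-force sketch. It is not the paper's linear-time algorithm. It assumes the usual definition from the suffixient-set literature, which the abstract does not restate: a string X is right-maximal if it is followed by at least two distinct characters in the text, and a set S of 1-based positions is suffixient if every one-character right-extension X+c of every right-maximal X is a suffix of some prefix T[1..x] with x in S. The function names are hypothetical, and the running time is polynomial (or exponential for the minimality check), so this is only suitable for tiny inputs.

```python
from itertools import combinations


def right_extensions(text):
    """Return all strings X + c such that X is right-maximal in text.

    Assumed definition: X is right-maximal if at least two distinct
    characters follow occurrences of X in text (X may be empty).
    """
    followers = {}
    n = len(text)
    for i in range(n):
        for j in range(i, n):
            # X = text[i:j] occurs at i and is followed by text[j].
            followers.setdefault(text[i:j], set()).add(text[j])
    return {x + c for x, chars in followers.items() if len(chars) >= 2 for c in chars}


def is_suffixient(text, positions):
    """Problem (1): check whether the 1-based positions form a suffixient set."""
    prefixes = [text[:x] for x in positions]
    return all(
        any(prefix.endswith(ext) for prefix in prefixes)
        for ext in right_extensions(text)
    )


def is_smallest_suffixient(text, positions):
    """Problem (2): check suffixiency and minimum cardinality by exhaustive search."""
    positions = set(positions)
    if not is_suffixient(text, positions):
        return False
    all_positions = range(1, len(text) + 1)
    # The set is of minimum cardinality if no strictly smaller set is suffixient.
    return not any(
        is_suffixient(text, candidate)
        for size in range(len(positions))
        for candidate in combinations(all_positions, size)
    )


if __name__ == "__main__":
    text = "aab"
    print(is_suffixient(text, {2, 3}))
    print(is_suffixient(text, {1, 3}))
    print(is_smallest_suffixient(text, {1, 2, 3}))
```

For the text "aab", the right-maximal strings under this definition are the empty string and "a", so the right-extensions to cover are "a", "b", "aa", and "ab". Positions {2, 3} cover all four, while {1, 3} misses "aa".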