Merging RLBWTs adaptively

Published: November 21, 2025 | arXiv ID: 2511.16953v1

By: Travis Gagie

Potential Business Impact:

Merges compressed text data much faster.

Business Areas:
A/B Testing Data and Analytics

We show how to merge run-length compressed Burrows-Wheeler Transforms (RLBWTs) quickly and in $O(R)$ space, where $R$ is the total number of runs in them, when a certain parameter is small. Specifically, we consider the boundaries in their combined extended Burrows-Wheeler Transform (eBWT) between blocks of characters from the same original RLBWT, and denote by $L$ the sum of the longest common prefix (LCP) values at those boundaries. We show how to merge the RLBWTs in $\tilde{O}(L + \sigma + R)$ time, where $\sigma$ is the alphabet size. We conjecture that $L$ tends to be small when the strings (or sets of strings) underlying the original RLBWTs are repetitive but dissimilar.
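
To make the parameters concrete, the following is a minimal brute-force sketch in Python that computes a combined eBWT of two strings, the boundary LCP sum $L$, and a run count. It is not the paper's merging algorithm, which works directly on the compressed inputs; it only illustrates the definitions. The function names (`merged_ebwt`, `boundary_lcp_sum`, `run_count`) are hypothetical, and the sketch assumes each input is a single primitive string and that the two strings are not rotations of each other, so that the order of rotations is unambiguous. Measuring LCPs on prefixes truncated to the combined length is also an assumption of this sketch.

```python
from itertools import groupby

def rotations(s, src):
    """All cyclic rotations of s, each tagged with its source id."""
    return [(s[i:] + s[:i], src) for i in range(len(s))]

def omega_key(rot, length):
    """Prefix of the infinite power rot^omega, truncated to `length`.

    With length = len(s1) + len(s2), two distinct infinite powers
    already differ inside this prefix (Fine and Wilf), so sorting by
    it orders the rotations as the eBWT requires.
    """
    reps = length // len(rot) + 1
    return (rot * reps)[:length]

def merged_ebwt(s1, s2):
    """Rows (key, preceding character, source id) in sorted order."""
    length = len(s1) + len(s2)
    rows = [(omega_key(rot, length), rot[-1], src)
            for rot, src in rotations(s1, 0) + rotations(s2, 1)]
    rows.sort(key=lambda row: row[0])
    return rows

def lcp(a, b):
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def boundary_lcp_sum(rows):
    """Sum of LCPs over adjacent rows that come from different sources."""
    return sum(lcp(p[0], q[0])
               for p, q in zip(rows, rows[1:]) if p[2] != q[2])

def run_count(bwt):
    """Number of maximal runs of equal characters."""
    return sum(1 for _ in groupby(bwt))

rows = merged_ebwt("ab", "ac")
ebwt = "".join(row[1] for row in rows)
print(ebwt, boundary_lcp_sum(rows), run_count(ebwt))  # bcaa 1 3
```

Here the sorted rotations are `ab`, `ac`, `ba`, `ca`, which alternate between the two sources; only the first boundary (`ab` against `ac`) shares a prefix, so $L = 1$. This quadratic-space approach is only practical for toy inputs.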

Page Count
4 pages

Category
Computer Science:
Data Structures and Algorithms