Buffered Partially-Persistent External-Memory Search Trees
By: Gerth Stølting Brodal, Casper Moldrup Rysgaard, Rolf Svenning
Potential Business Impact:
Keeps old data safe while adding new info.
We present an optimal partially-persistent external-memory search tree with amortized I/O bounds matching those achieved by the non-persistent $B^{\varepsilon}$-tree by Brodal and Fagerberg [SODA 2003]. In a partially-persistent data structure each update creates a new version of the data structure, where all past versions can be queried, but only the current version can be updated. All operations should be efficient with respect to the size $N_v$ of the accessed version $v$. For any parameter $0<\varepsilon<1$, our data structure supports insertions and deletions in amortized $O\!\left(\frac{1}{\varepsilon B^{1-\varepsilon}}\log_B N_v\right)$ I/Os, where $B$ is the external-memory block size. It also supports successor and range reporting queries in amortized $O\!\left(\frac{1}{\varepsilon}\log_B N_v+K/B\right)$ I/Os, where $K$ is the number of values reported. The space usage of the data structure is linear in the total number of updates. We make the standard and minimal assumption that the internal memory has size $M \geq 2B$. The previous state-of-the-art external-memory partially-persistent search tree by Arge, Danner, and Teh [JEA 2003] supports all operations in worst-case $O\!\left(\log_B N_v+K/B\right)$ I/Os, matching the bounds achieved by the classical B-tree by Bayer and McCreight [Acta Informatica 1972]. Our data structure successfully combines buffering updates with partial persistence. The I/O bounds can also be achieved in the worst-case sense, by slightly modifying our data structure and under the requirement that the memory size $M = \Omega\!\left(B^{1-\varepsilon}\log_2(\max_v N_v)\right)$. The worst-case result slightly improves the memory requirement over the previous ephemeral external-memory dictionary by Das, Iacono, and Nekrich [ISAAC 2022], who achieved matching worst-case I/O bounds but required $M=\Omega\!\left(B\log_B N\right)$.
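To make the partial-persistence interface concrete, here is a minimal in-memory sketch of the semantics only: every update creates a new version, any version can be queried, and only the latest version can be updated. It is not the data structure from the paper and has none of its I/O bounds. The class name `NaivePartiallyPersistentSet`, its method names, and the convention that `successor` returns the smallest key greater than or equal to the query are all assumptions made for illustration.

```python
import bisect


class NaivePartiallyPersistentSet:
    """Toy model of partial persistence (hypothetical, semantics only).

    Each key stores the version intervals during which it was present.
    Queries scan linearly, so this is far from the bounds in the abstract.
    """

    def __init__(self):
        self.version = 0        # number of the current (latest) version
        self._keys = []         # sorted list of every key ever inserted
        self._lifetimes = {}    # key -> list of [born, died) version spans

    def insert(self, key):
        """Insert key into the current version; return the new version."""
        self.version += 1
        spans = self._lifetimes.get(key)
        if spans is None:
            bisect.insort(self._keys, key)
            spans = self._lifetimes[key] = []
        if not spans or spans[-1][1] is not None:  # key not currently present
            spans.append([self.version, None])
        return self.version

    def delete(self, key):
        """Delete key from the current version; return the new version."""
        self.version += 1
        spans = self._lifetimes.get(key)
        if spans and spans[-1][1] is None:         # key currently present
            spans[-1][1] = self.version            # close its open span
        return self.version

    def _alive(self, key, v):
        """True if key is present in version v."""
        return any(
            born <= v and (died is None or v < died)
            for born, died in self._lifetimes[key]
        )

    def successor(self, v, x):
        """Smallest key >= x present in version v, or None."""
        start = bisect.bisect_left(self._keys, x)
        for key in self._keys[start:]:
            if self._alive(key, v):
                return key
        return None

    def range_report(self, v, lo, hi):
        """All keys in [lo, hi] present in version v, in sorted order."""
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        return [k for k in self._keys[i:j] if self._alive(k, v)]


s = NaivePartiallyPersistentSet()
v1 = s.insert(10)
v2 = s.insert(20)
v3 = s.delete(10)
print(s.successor(v2, 5))          # 10: key 10 still exists in version 2
print(s.successor(v3, 5))          # 20: key 10 was deleted in version 3
print(s.range_report(v1, 0, 100))  # [10]
```

The sketch keeps all history, so its space is linear in the number of updates, as in the abstract, but a query against an old version pays for keys that were never part of that version. Making the cost depend only on $N_v$, while also buffering updates, is the problem the paper addresses.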
Similar Papers
Lazy B-Trees
Data Structures and Algorithms
Makes computer searches faster, even with lots of data.
BS-tree: A gapped data-parallel B-tree
Databases
Makes computer memory searches much faster.
On Incremental Approximate Shortest Paths in Directed Graphs
Data Structures and Algorithms
Finds shortest paths faster when roads change.