Score: 0

LMG Index: A Robust Learned Index for Multi-Dimensional Performance Balance

Published: December 31, 2025 | arXiv ID: 2512.24824v1

By: Yuzhen Chen, Bin Yao

Index structures are fundamental for efficient query processing on large-scale datasets. Learned indexes model the indexing process as a prediction problem to overcome the inherent trade-offs of traditional indexes. However, most existing learned indexes optimize only for limited objectives like query latency or space usage, neglecting other practical evaluation dimensions such as update efficiency and stability. Moreover, many learned indexes rely on assumptions about data distributions or workloads, lacking theoretical guarantees when facing unknown or evolving scenarios, which limits their generality in real-world systems. In this paper, we propose LMIndex, a robust framework for learned indexing that leverages a efficient query/update top-layer structure (theoretically $O(1)$ when the key type is fixed) and a efficient optimal error threshold training algorithm (approach $O(1)$ in practice). Building upon this, we develop LMG (LMIndex with gaps), a variant employing a novel gap allocation strategy to enhance update performance and maintain stability under dynamic workloads. Extensive evaluations show that LMG achieves competitive or leading performance, including bulk loading (up to 8.25$\times$ faster), point queries (up to 1.49$\times$ faster), range queries (up to 4.02$\times$ faster than B+Tree), update (up to 1.5$\times$ faster on read-write workloads), stability (up to 82.59$\times$ lower coefficient of variation), and space usage (up to 1.38$\times$ smaller). These results demonstrate that LMG effectively breaks the multi-dimensional performance trade-offs inherent in state-of-the-art approaches, offering a balanced and versatile framework.

Category
Computer Science:
Databases