A Unified Model for Cardinality Estimation by Learning from Data and Queries via Sum-Product Networks
By: Jiawei Liu, Ju Fan, Tongyu Liu and more
Potential Business Impact:
Makes computer databases find information faster.
Cardinality estimation is a fundamental component of database systems and is crucial for generating efficient execution plans. Despite advances in learning-based cardinality estimation, existing methods may struggle to optimize three key criteria at the same time: estimation accuracy, inference time, and storage overhead. This limits their practical applicability in real-world database environments. This paper introduces QSPN, a unified model that integrates both data distribution and query workload. QSPN achieves high estimation accuracy by modeling the data distribution with the simple yet effective Sum-Product Network (SPN) structure. To keep inference time low and reduce storage overhead, QSPN further partitions columns based on query access patterns. We formalize QSPN as a tree-based structure that extends SPNs with two new node types: QProduct and QSplit. This paper studies the research challenges of developing efficient algorithms for the offline construction and online computation of QSPN. We conduct extensive experiments to evaluate QSPN in both single-table and multi-table cardinality estimation settings. The experimental results demonstrate that QSPN achieves superior and robust performance on the three key criteria compared with state-of-the-art approaches.
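To make the underlying idea concrete, here is a minimal sketch of how a plain Sum-Product Network can estimate the cardinality of a conjunctive range query. It is an illustration under stated assumptions, not the paper's implementation: the class names (`Leaf`, `Product`, `Sum`), the function `estimate_cardinality`, and the toy table are all hypothetical, and the QProduct and QSplit node types are not modeled because the abstract does not specify their behavior.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# A query maps a column name to an inclusive [lo, hi] range predicate.
Range = Tuple[float, float]
Query = Dict[str, Range]


@dataclass
class Leaf:
    """Univariate distribution of one column (value -> probability)."""
    column: str
    probs: Dict[float, float]

    def prob(self, query: Query) -> float:
        if self.column not in query:
            return 1.0  # no predicate on this column
        lo, hi = query[self.column]
        return sum(p for v, p in self.probs.items() if lo <= v <= hi)


@dataclass
class Product:
    """Children cover disjoint column sets that are treated as independent."""
    children: List[object]

    def prob(self, query: Query) -> float:
        result = 1.0
        for child in self.children:
            result *= child.prob(query)
        return result


@dataclass
class Sum:
    """Weighted mixture over row clusters; weights are assumed to sum to 1."""
    weights: List[float]
    children: List[object]

    def prob(self, query: Query) -> float:
        return sum(w * c.prob(query) for w, c in zip(self.weights, self.children))


def estimate_cardinality(root, query: Query, num_rows: int) -> float:
    """Estimated row count = selectivity from the SPN times the table size."""
    return root.prob(query) * num_rows


# Hypothetical toy table: two equally sized row clusters over (age, city).
cluster_a = Product([Leaf("age", {20: 0.5, 30: 0.5}), Leaf("city", {1: 1.0})])
cluster_b = Product([Leaf("age", {40: 1.0}), Leaf("city", {1: 0.25, 2: 0.75})])
root = Sum([0.5, 0.5], [cluster_a, cluster_b])

query = {"age": (25, 45), "city": (1, 1)}
estimate = estimate_cardinality(root, query, num_rows=1000)
print(estimate)  # 0.5 * 0.5 + 0.5 * 0.25 = 0.375 selectivity, so about 375 rows
```

Sum nodes split rows into clusters and Product nodes split columns into independent groups, so inference is a single bottom-up pass over the tree. The abstract's point is that QSPN additionally uses query access patterns when partitioning columns, which this sketch does not attempt.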
Similar Papers
N-Parties Private Structure and Parameter Learning for Sum-Product Networks
Cryptography and Security
Keeps private data safe when learning from it.
Sketched Sum-Product Networks for Joins
Databases
Makes computer searches faster by guessing results.
The Space-Time Complexity of Sum-Product Queries
Databases
Makes searching information use less computer memory.