Score: 2

Efficient Partition-based Approaches for Diversified Top-k Subgraph Matching

Published: November 24, 2025 | arXiv ID: 2511.19008v1

By: Liuyi Chen, Yuchen Hu, Zhengyi Yang, and more

Potential Business Impact:

Finds different patterns in connected data faster.

Business Areas:

Big Data, Data and Analytics

Subgraph matching is a core task in graph analytics, widely used in domains such as biology, finance, and social networks. Existing top-k diversified methods typically focus on maximizing vertex coverage, but they often return results from the same region of the graph, which limits topological diversity. We propose the Distance-Diversified Top-k Subgraph Matching (DTkSM) problem, which selects k isomorphic matches with maximal pairwise topological distances to better capture global graph structure. To address its computational challenges, we introduce the Partition-based Distance Diversity (PDD) framework, which partitions the graph and retrieves diverse matches from distant regions. To enhance efficiency, we develop two optimizations: embedding-driven partition filtering and densest-based partition selection over a Partition Adjacency Graph. Experiments on 12 real-world datasets show that our approach achieves up to four orders of magnitude speedup over baselines, with 95% of results reaching 80% of optimal distance diversity and 100% coverage diversity.
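
To make the distance-diversity objective concrete, here is a minimal Python sketch of choosing k matches that lie far apart in the graph. It is not the PDD framework from the paper. It assumes the matches have already been enumerated, it assumes the distance between two matches is the smallest hop distance between any of their vertices (the abstract does not give the exact definition), and it uses a standard greedy farthest-first heuristic in place of the partition-based search. The names bfs_distances and greedy_diverse_top_k are hypothetical.

```python
from collections import deque


def bfs_distances(adj, sources):
    """Multi-source BFS: hop distance from the nearest source to each reachable vertex."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def greedy_diverse_top_k(adj, matches, k):
    """Pick up to k matches by farthest-first selection.

    Assumption: the distance between two matches is the minimum hop
    distance between any pair of their vertices. This is an illustrative
    heuristic, not the algorithm described in the paper.
    """
    if k <= 0 or not matches:
        return []
    target = min(k, len(matches))
    chosen = [0]  # start from the first match (arbitrary seed)
    # min_dist[i]: distance from matches[i] to its nearest chosen match
    min_dist = [float("inf")] * len(matches)
    while True:
        last = chosen[-1]
        dist = bfs_distances(adj, matches[last])
        for i, match in enumerate(matches):
            d = min(dist.get(v, float("inf")) for v in match)
            min_dist[i] = min(min_dist[i], d)
        min_dist[last] = -1  # mark as chosen so it is never picked again
        if len(chosen) == target:
            break
        # next pick: the match farthest from everything chosen so far
        chosen.append(max(range(len(matches)), key=lambda i: min_dist[i]))
    return [matches[i] for i in chosen]


# Usage: a path graph 0-1-...-9 where each match is a single edge.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
edge_matches = [(0, 1), (1, 2), (4, 5), (8, 9)]
print(greedy_diverse_top_k(path, edge_matches, 2))
```

Each round runs one breadth-first search from the most recently chosen match, so the sketch costs roughly k graph traversals. That is affordable only on small graphs with a precomputed match list, which is the kind of cost the partition-based approach in the abstract is meant to avoid.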