Efficient Partition-based Approaches for Diversified Top-k Subgraph Matching
By: Liuyi Chen, Yuchen Hu, Zhengyi Yang and more
Potential Business Impact:
Finds different patterns in connected data faster.
Subgraph matching is a core task in graph analytics, widely used in domains such as biology, finance, and social networks. Existing top-k diversified methods typically focus on maximizing vertex coverage, but often return results in the same region, limiting topological diversity. We propose the Distance-Diversified Top-k Subgraph Matching (DTkSM) problem, which selects k isomorphic matches with maximal pairwise topological distances to better capture global graph structure. To address its computational challenges, we introduce the Partition-based Distance Diversity (PDD) framework, which partitions the graph and retrieves diverse matches from distant regions. To enhance efficiency, we develop two optimizations: embedding-driven partition filtering and densest-based partition selection over a Partition Adjacency Graph. Experiments on 12 real-world datasets show that our approach achieves up to four orders of magnitude speedup over baselines, with 95% of results reaching 80% of optimal distance diversity and 100% coverage diversity.
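To make the distance-diversity objective concrete, the following is a minimal Python sketch of selecting k matches that are far apart in the graph. It is not the PDD framework and involves no partitioning. It assumes the matches have already been enumerated, assumes the distance between two matches is the fewest hops between any of their vertices (the abstract does not state the exact definition), and uses a simple greedy max-min heuristic rather than an optimal solver. The names `bfs_distances`, `match_distance`, and `greedy_diverse_top_k` are hypothetical.

```python
from collections import deque


def bfs_distances(adj, sources):
    """Multi-source BFS: hop distance from the nearest source to each reachable vertex."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def match_distance(adj, a, b):
    """Assumed distance between two matches: fewest hops between any of their vertices."""
    dist = bfs_distances(adj, a)
    return min((dist[v] for v in b if v in dist), default=float("inf"))


def greedy_diverse_top_k(adj, matches, k):
    """Greedy max-min heuristic: keep adding the match farthest from those already chosen."""
    if not matches or k <= 0:
        return []
    chosen_idx = [0]  # seed with the first match (an arbitrary choice)
    # nearest[j] = distance from matches[j] to its closest chosen match
    nearest = [match_distance(adj, matches[0], m) for m in matches]
    while len(chosen_idx) < min(k, len(matches)):
        candidates = [j for j in range(len(matches)) if j not in chosen_idx]
        i = max(candidates, key=lambda j: nearest[j])
        chosen_idx.append(i)
        for j, m in enumerate(matches):
            nearest[j] = min(nearest[j], match_distance(adj, matches[i], m))
    return [matches[i] for i in chosen_idx]


# Toy data graph: a path 0-1-2-...-9, with an edge as the query pattern.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
matches = [(0, 1), (1, 2), (4, 5), (8, 9)]
print(greedy_diverse_top_k(adj, matches, 3))
```

On this toy path graph the heuristic skips the match (1, 2), which overlaps the seed match, in favor of matches from distant regions. Enumerating every match up front is exactly the cost that a partition-based approach is meant to avoid, so this sketch illustrates only the objective, not the efficiency techniques described above.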
Similar Papers
Differentially Private Densest-$k$-Subgraph
Data Structures and Algorithms
Finds important groups in secret data safely.
A customizable inexact subgraph matching algorithm for attributed graphs
Data Structures and Algorithms
Finds hidden patterns in messy data relationships.
DS-Span: Single-Phase Discriminative Subgraph Mining for Efficient Graph Embeddings
Machine Learning (CS)
Finds important patterns in data faster.