Scalable Similarity Search over Large Attributed Bipartite Graphs
By: Xi Ou , Longlong Lin , Zeli Wang and more
Bipartite graphs are widely used to model relationships between entities of different types, where nodes are divided into two disjoint sets. Similarity search, a fundamental operation that retrieves nodes similar to a given query node, plays a crucial role in various real-world applications, including machine learning and graph clustering. However, existing state-of-the-art methods often struggle to accurately capture the unique structural properties of bipartite graphs or fail to incorporate the informative node attributes, leading to suboptimal performance. Besides, their high computational complexity limits scalability, making them impractical for large graphs with millions of nodes and tens of thousands of attributes. To overcome these challenges, we first introduce Attribute-augmented Hidden Personalized PageRank (AHPP), a novel random walk model designed to blend seamlessly both the higher-order bipartite structure proximity and attribute similarity. We then formulate the similarity search over attributed bipartite graphs as an approximate AHPP problem and propose two efficient push-style local algorithms with provable approximation guarantees. Finally, extensive experiments on real-world and synthetic datasets validate the effectiveness of AHPP and the efficiency of our proposed algorithms when compared with fifteen competitors.
Similar Papers
A customizable inexact subgraph matching algorithm for attributed graphs
Data Structures and Algorithms
Finds hidden patterns in messy data relationships.
Online Sparsification of Bipartite-Like Clusters in Graphs
Data Structures and Algorithms
Finds groups of connected things faster.
Efficient Partition-based Approaches for Diversified Top-k Subgraph Matching
Databases
Finds different patterns in connected data faster.