
Fast online node labeling with graph subsampling

Published: March 21, 2025 | arXiv ID: 2503.16755v2

By: Yushen Huang, Ertai Luo, Reza Babenezhad and more

Potential Business Impact:

Makes big data graphs faster to search.

Business Areas:

Natural Language Processing Artificial Intelligence, Data and Analytics, Software

Large data applications rely on storing data in massive, sparse graphs with millions to trillions of nodes. Graph-based methods, such as node prediction, aim for computational efficiency regardless of graph size. Techniques like localized approximate personalized PageRank (APPR) solve sparse linear systems with complexity independent of graph size; the complexity instead depends on the maximum node degree, which in real-world large graphs can be much larger than the average node degree. In this paper, we consider an online subsampled APPR method, where messages are intentionally dropped at random. We use tools from graph sparsifiers and matrix linear algebra to give approximation bounds on the graph's spectral properties ($O(1/\epsilon^2)$ edges) and on node classification performance (added $O(n\epsilon)$ overhead).
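To make the idea of dropping messages concrete, here is a minimal Python sketch of a push-style APPR loop in which each push sends residual mass to only a random subset of neighbors. It is an illustration under stated assumptions, not the paper's algorithm: the function name `subsampled_appr`, the `max_fanout` cap, and the choice to split the pushed mass equally among the sampled neighbors (so that total mass is conserved) are all hypothetical choices made for this example. The abstract does not specify the sampling or reweighting scheme.

```python
import random
from collections import defaultdict, deque


def subsampled_appr(adj, seed, alpha=0.15, eps=1e-3, max_fanout=4, rng=None):
    """Push-style approximate personalized PageRank with neighbor subsampling.

    Hypothetical sketch: each push forwards mass to at most `max_fanout`
    uniformly sampled neighbors, so the per-push cost no longer scales
    with the node's full degree. `adj` maps every node to a list of neighbors.
    """
    rng = rng or random.Random(0)
    p = defaultdict(float)  # PageRank estimate
    r = defaultdict(float)  # residual mass still to be pushed
    r[seed] = 1.0
    queue, queued = deque([seed]), {seed}

    while queue:
        u = queue.popleft()
        queued.discard(u)
        nbrs, mass = adj[u], r[u]
        if not nbrs:
            # Isolated node: nothing to forward, so absorb all the mass.
            p[u] += mass
            r[u] = 0.0
            continue
        if mass < eps * len(nbrs):
            continue  # residual already below the push threshold

        # Keep an alpha fraction locally, forward the rest.
        p[u] += alpha * mass
        r[u] = 0.0

        # Subsampling step (assumption): messages to unsampled neighbors
        # are dropped, and the forwarded mass is split among the sample.
        targets = nbrs if len(nbrs) <= max_fanout else rng.sample(nbrs, max_fanout)
        share = (1.0 - alpha) * mass / len(targets)
        for v in targets:
            r[v] += share
            if v not in queued and r[v] >= eps * len(adj[v]):
                queue.append(v)
                queued.add(v)

    return dict(p), dict(r)


# Toy graph: hub 0 joined to nodes 1..20, which also form a ring.
n = 20
adj = {0: list(range(1, n + 1))}
for i in range(1, n + 1):
    left = n if i == 1 else i - 1
    right = 1 if i == n else i + 1
    adj[i] = [0, left, right]

p, r = subsampled_appr(adj, seed=0)
```

In this sketch the hub has degree 20 but each push from it touches only 4 neighbors, which is the kind of saving the maximum-degree dependence motivates. Because every push moves at least `alpha * eps` of mass into `p`, the loop performs at most `1 / (alpha * eps)` pushes regardless of graph size.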