Fast online node labeling with graph subsampling
By: Yushen Huang, Ertai Luo, Reza Babenezhad and more
Potential Business Impact:
Makes big data graphs faster to search.
Large data applications rely on storing data in massive, sparse graphs with millions to trillions of nodes. Graph-based methods, such as node prediction, aim for computational efficiency regardless of graph size. Techniques like localized approximate personalized page rank (APPR) solve sparse linear systems with complexity that is independent of graph size but depends on the maximum node degree, which in real-world large graphs can be much larger than the average node degree. In this paper, we consider an online subsampled APPR method, where messages are intentionally dropped at random. We use tools from graph sparsifiers and matrix linear algebra to give approximation bounds on the graph's spectral properties ($O(1/\epsilon^2)$ edges) and node classification performance (added $O(n\epsilon)$ overhead).
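To make the idea of randomly dropping messages concrete, here is a minimal sketch of a push-style APPR loop with edge subsampling. It is not the paper's algorithm: the function name `subsampled_appr`, the parameters `keep_prob` and `max_pushes`, and the choice to rescale surviving messages by `1 / keep_prob` (so that the pushed mass is unbiased in expectation) are all assumptions made for illustration.

```python
import random
from collections import defaultdict, deque


def subsampled_appr(adj, seed, alpha=0.15, eps=1e-4, keep_prob=1.0,
                    rng=None, max_pushes=1_000_000):
    """Push-style approximate personalized PageRank with random message dropping.

    adj       : dict mapping node -> list of neighbors (undirected graph)
    seed      : node where the personalization mass starts
    alpha     : teleportation probability
    eps       : push threshold; a node is pushed while r[u] >= eps * deg(u)
    keep_prob : (assumed) probability that a message to a neighbor is kept
    """
    rng = rng or random.Random(0)
    p = defaultdict(float)   # PageRank estimate
    r = defaultdict(float)   # residual mass still to be distributed
    r[seed] = 1.0
    queue, queued = deque([seed]), {seed}
    pushes = 0

    while queue and pushes < max_pushes:
        u = queue.popleft()
        queued.discard(u)
        deg = len(adj[u])
        if deg == 0 or r[u] < eps * deg:
            continue

        mass, r[u] = r[u], 0.0
        p[u] += alpha * mass                   # keep a fraction alpha at u
        share = (1.0 - alpha) * mass / deg     # remainder is split over neighbors

        for v in adj[u]:
            # Drop the message with probability 1 - keep_prob.
            if keep_prob < 1.0 and rng.random() >= keep_prob:
                continue
            # Rescale survivors so the expected mass sent matches the full push.
            r[v] += share / keep_prob
            if v not in queued and r[v] >= eps * len(adj[v]):
                queue.append(v)
                queued.add(v)
        pushes += 1

    return dict(p), dict(r)


# Small example graph (hypothetical): a triangle with one pendant node.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
p_full, r_full = subsampled_appr(graph, seed=0)
p_sub, r_sub = subsampled_appr(graph, seed=0, keep_prob=0.5,
                               rng=random.Random(1))
```

With `keep_prob=1.0` the loop reduces to a standard (non-lazy) push procedure, and the estimate and residual together always sum to one. With `keep_prob < 1.0` a push sends roughly `keep_prob * deg(u)` messages in expectation, at the cost of extra variance in the estimate; this sketch still visits every neighbor to flip the coin, so it illustrates the message pattern rather than the runtime savings.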
Similar Papers
Fast and Simple Densest Subgraph with Predictions
Data Structures and Algorithms
Finds the most connected group in networks faster.
Empirical Error Estimates for Graph Sparsification
Machine Learning (CS)
Makes computer graphs more accurate and faster.
When Noisy Labels Meet Class Imbalance on Graphs: A Graph Augmentation Method with LLM and Pseudo Label
Machine Learning (CS)
Fixes computer problems with messy, incomplete data.