Graph neural networks extrapolate out-of-distribution for shortest paths
By: Robert R. Nerem, Samantha Chen, Sanjoy Dasgupta, and others
Potential Business Impact:
Guarantees that networks trained on small graphs solve shortest-path problems of any size.
Neural networks (NNs), despite their success and wide adoption, still struggle to extrapolate out-of-distribution (OOD), i.e., to inputs that are not well-represented by their training dataset. Addressing the OOD generalization gap is crucial when models are deployed in environments significantly different from the training set, such as applying graph neural networks (GNNs) trained on small graphs to large, real-world graphs. One promising approach for achieving robust OOD generalization is the framework of neural algorithmic alignment, which incorporates ideas from classical algorithms by designing neural architectures that resemble specific algorithmic paradigms (e.g., dynamic programming). The hope is that trained models of this form would have superior OOD capabilities, in much the same way that classical algorithms work for all instances. We rigorously analyze the role of algorithmic alignment in achieving OOD generalization, focusing on GNNs applied to the canonical shortest path problem. We prove that GNNs, trained to minimize a sparsity-regularized loss over a small set of shortest path instances, exactly implement the Bellman-Ford (BF) algorithm for shortest paths. In fact, if a GNN minimizes this loss within an error of $\epsilon$, it implements the BF algorithm with an error of $O(\epsilon)$. Consequently, despite limited training data, these GNNs are guaranteed to extrapolate to arbitrary shortest-path problems, including instances of any size. Our empirical results support our theory by showing that NNs trained by gradient descent are able to minimize this loss and extrapolate in practice.
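The alignment the abstract describes can be seen in how one Bellman-Ford relaxation round mirrors a GNN message-passing layer: each node aggregates candidate distances from its neighbors with a min operation. A minimal sketch of this correspondence (not the paper's implementation; the function name and edge-list representation are illustrative assumptions):

```python
def bellman_ford(n, edges, source):
    """Compute shortest-path distances from `source` on a directed graph.

    n:      number of nodes (labeled 0..n-1)
    edges:  list of (u, v, w) directed edges with weight w
    """
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0.0
    # n-1 rounds suffice when there are no negative cycles; each round
    # plays the role of one GNN message-passing layer.
    for _ in range(n - 1):
        new_dist = list(dist)
        for u, v, w in edges:
            # "Message" from u to v; min-aggregation is the node update.
            if dist[u] + w < new_dist[v]:
                new_dist[v] = dist[u] + w
        dist = new_dist
    return dist
```

The synchronous update (computing `new_dist` from the previous round's `dist`) matches the layer-wise structure of a GNN, where every node updates simultaneously from its neighbors' previous states.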
Similar Papers
Rethinking Graph Out-Of-Distribution Generalization: A Learnable Random Walk Perspective
Machine Learning (CS)
Teaches computers to work with new, different data.
Evolving Graph Learning for Out-of-Distribution Generalization in Non-stationary Environments
Machine Learning (CS)
Helps computers learn from changing data better.
Out-of-Distribution Detection in Heterogeneous Graphs via Energy Propagation
Machine Learning (CS)
Finds strange new things in connected data.