Learning Shortest Paths When Data is Scarce
By: Dmytro Matsypura, Yu Pan, Hanzhao Wang
Potential Business Impact:
Corrects biased simulator-based route estimates using a small amount of real-world data.
Digital twins and other simulators are increasingly used to support routing decisions in large-scale networks. However, simulator outputs often exhibit systematic bias, while ground-truth measurements are costly and scarce. We study a stochastic shortest-path problem in which a planner has access to abundant synthetic samples, limited real-world observations, and an edge-similarity structure capturing expected behavioral similarity across links. We model the simulator-to-reality discrepancy as an unknown, edge-specific bias that varies smoothly over the similarity graph, and estimate it using Laplacian-regularized least squares. This approach yields calibrated edge cost estimates even in data-scarce regimes. We establish finite-sample error bounds, translate estimation error into path-level suboptimality guarantees, and propose a computable, data-driven certificate that verifies near-optimality of a candidate route. For cold-start settings without initial real data, we develop a bias-aware active learning algorithm that leverages the simulator and adaptively selects edges to measure until a prescribed accuracy is met. Numerical experiments on multiple road networks and traffic graphs further demonstrate the effectiveness of our methods.
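To make the estimation step concrete, here is a minimal sketch of Laplacian-regularized least squares for the edge bias, followed by a shortest-path computation on the calibrated costs. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact objective, so the sketch assumes the bias b minimizes sum_e n_e (d_e - b_e)^2 + lam * b^T L b, where d_e is the gap between the mean real observation and the simulator cost on edge e, n_e is the number of real observations, and L is the Laplacian of the edge-similarity graph. The function names, the toy network, and the value of `lam` are all hypothetical.

```python
import heapq
import numpy as np

def estimate_bias(sim_cost, real_mean, n_obs, similarity, lam=1.0):
    """Assumed estimator: solve (N + lam * L) b = N d for the edge bias b."""
    sim_cost = np.asarray(sim_cost, dtype=float)
    n_obs = np.asarray(n_obs, dtype=float)
    W = np.asarray(similarity, dtype=float)
    L = np.diag(W.sum(axis=1)) - W            # Laplacian of the similarity graph
    # Observed simulator-to-reality gap; unobserved edges contribute nothing.
    d = np.where(n_obs > 0, np.asarray(real_mean, dtype=float) - sim_cost, 0.0)
    A = np.diag(n_obs) + lam * L
    # A is nonsingular when every connected component of the similarity
    # graph contains at least one observed edge and lam > 0.
    return np.linalg.solve(A, n_obs * d)

def shortest_path(edges, costs, source, target):
    """Plain Dijkstra over directed edges with nonnegative costs."""
    adj = {}
    for (u, v), c in zip(edges, costs):
        adj.setdefault(u, []).append((v, float(c)))
    dist, prev, heap = {source: 0.0}, {}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue                           # stale heap entry
        for v, c in adj.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]

# Hypothetical toy network: two routes from node 0 to node 3.
edges = [(0, 1), (1, 3), (0, 2), (2, 3)]
sim_cost = np.array([1.0, 1.0, 1.5, 1.5])       # simulator means
real_mean = np.array([3.0, np.nan, 1.5, np.nan])  # real data on edges 0 and 2 only
n_obs = np.array([5, 0, 5, 0])
# Assumed similarity: edge 0 resembles edge 1, edge 2 resembles edge 3.
similarity = np.zeros((4, 4))
similarity[0, 1] = similarity[1, 0] = 1.0
similarity[2, 3] = similarity[3, 2] = 1.0

bias = estimate_bias(sim_cost, real_mean, n_obs, similarity, lam=1.0)
calibrated = np.maximum(sim_cost + bias, 0.0)   # clip so Dijkstra stays valid
sim_path, sim_len = shortest_path(edges, sim_cost, 0, 3)
cal_path, cal_len = shortest_path(edges, calibrated, 0, 3)
print("estimated bias:", bias)
print("simulator route:", sim_path, "calibrated route:", cal_path)
```

In this toy case the unobserved edge (1, 3) inherits the bias measured on its similar neighbor (0, 1), so the calibrated route switches away from the path the simulator alone would pick. Larger `lam` enforces more smoothing across similar edges; how to choose it, and the error bounds and certificate mentioned in the abstract, are outside what this sketch covers.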
Similar Papers
Knowledge-Guided Machine Learning for Stabilizing Near-Shortest Path Routing
Machine Learning (CS)
Helps computers send messages faster through networks.
Learning to Design City-scale Transit Routes
Machine Learning (CS)
Designs better bus routes, cutting wait times.
Contextual Strongly Convex Simulation Optimization: Optimize then Predict with Inexact Solutions
Machine Learning (Stat)
Helps computers make better choices faster.