Learning Shortest Paths When Data is Scarce
Dmytro Matsypura, Yu Pan, Hanzhao Wang

TL;DR
This paper introduces a method for estimating shortest paths in large networks by combining abundant synthetic samples with limited real-world observations under a smooth bias model, with theoretical guarantees and active learning strategies.
Contribution
The paper proposes a Laplacian-regularized bias estimation approach, derives finite-sample error bounds, and presents an active learning algorithm for data-efficient routing when real data is scarce.
Findings
Effective bias calibration in data-scarce regimes
Theoretical error bounds and suboptimality guarantees
Successful experiments on road and traffic networks
Abstract
Digital twins and other simulators are increasingly used to support routing decisions in large-scale networks. However, simulator outputs often exhibit systematic bias, while ground-truth measurements are costly and scarce. We study a stochastic shortest-path problem in which a planner has access to abundant synthetic samples, limited real-world observations, and an edge-similarity structure capturing expected behavioral similarity across links. We model the simulator-to-reality discrepancy as an unknown, edge-specific bias that varies smoothly over the similarity graph, and estimate it using Laplacian-regularized least squares. This approach yields calibrated edge cost estimates even in data-scarce regimes. We establish finite-sample error bounds, translate estimation error into path-level suboptimality guarantees, and propose a computable, data-driven certificate that verifies…
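The bias-estimation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact objective, so everything here is an assumption made for illustration: the objective sum_i (y_i - s_{e_i} - b_{e_i})^2 + lam * b^T L b, the function name `estimate_bias`, the dense similarity matrix `sim_weights`, and the regularization weight `lam` are all hypothetical. Here s is the per-edge mean of the synthetic samples, y_i is a real observation on edge e_i, and L is the Laplacian of the edge-similarity graph.

```python
import numpy as np

def estimate_bias(sim_weights, synthetic_mean, obs_edges, obs_costs, lam=1.0):
    """Hypothetical Laplacian-regularized least-squares bias estimate.

    Minimizes  sum_i (y_i - s[e_i] - b[e_i])^2 + lam * b^T L b  over b,
    where L is the Laplacian of the edge-similarity graph.
    Assumes the similarity graph is connected and at least one edge has a
    real observation, so that the linear system below is nonsingular.
    """
    W = np.asarray(sim_weights, dtype=float)      # symmetric similarity weights
    s = np.asarray(synthetic_mean, dtype=float)   # simulator mean cost per edge
    idx = np.asarray(obs_edges, dtype=int)        # edge index of each real observation
    y = np.asarray(obs_costs, dtype=float)        # observed real-world costs
    m = s.shape[0]

    L = np.diag(W.sum(axis=1)) - W                # graph Laplacian
    counts = np.bincount(idx, minlength=m).astype(float)
    resid_sum = np.bincount(idx, weights=y - s[idx], minlength=m)

    # First-order condition of the objective: (diag(counts) + lam * L) b = resid_sum
    A = np.diag(counts) + lam * L
    return np.linalg.solve(A, resid_sum)

# Toy example: four edges whose similarity graph is a chain 0-1-2-3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
s = np.array([10.0, 12.0, 9.0, 11.0])   # synthetic (simulator) mean costs
true_bias = 2.0                          # simulator underestimates every edge by 2
obs_edges = [0, 3]                       # real data only on edges 0 and 3
obs_costs = s[obs_edges] + true_bias     # noise-free observations for clarity

b_hat = estimate_bias(W, s, obs_edges, obs_costs, lam=1.0)
calibrated_cost = s + b_hat              # costs to pass to a shortest-path solver
```

In this toy case the bias is constant across edges, so the Laplacian penalty vanishes at the true bias and the estimate propagates the correction from the two observed edges to the two unobserved ones. The calibrated costs could then be passed to any standard shortest-path routine; with noisy observations, the choice of `lam` trades off fidelity to the real data against smoothness over the similarity graph.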
Taxonomy
Topics: Traffic Prediction and Management Techniques · Traffic control and management · Privacy-Preserving Technologies in Data
