Inferring the minimum spanning tree from a sample network
Jonathan Larson, Jukka-Pekka Onnela

TL;DR
This paper investigates how well a sample network's minimum spanning tree reflects the population MST, providing theoretical probability results and simulation-based insights across different graph models.
Contribution
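It proves that, when nodes are sampled uniformly at random from a complete graph with unique edge weights, an edge of the sample MST belongs to the population MST with probability n/N. It then explores the same conditional probability by simulation on several graph types and offers sampling recommendations.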
Findings
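When n nodes are sampled uniformly at random from a complete graph with N nodes and unique edge weights, the probability that a sample MST edge is also in the population MST is n/N.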
Simulation results are broadly similar across complete graphs, G(N,p) graphs, and graphs whose nodes follow a bivariate standard normal distribution.
Results for the BA graphs and the empirical HIV graph are similar to each other.
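The n/N finding for complete graphs can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `kruskal_mst` and `estimate_conditional_probability`, the default sizes, and the use of i.i.d. uniform random edge weights (which are unique with probability 1) are all assumptions made for illustration. It builds the population MST and the sample MST with Kruskal's algorithm and reports the fraction of sample MST edges that also appear in the population MST.

```python
import random
from itertools import combinations


def kruskal_mst(nodes, weight):
    """Return the MST edge set of the complete graph on `nodes`.

    `weight` maps each edge (u, v) with u < v to its weight.
    Uses Kruskal's algorithm with a simple union-find structure.
    """
    parent = {v: v for v in nodes}

    def find(v):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    mst = set()
    edges = sorted(combinations(sorted(nodes), 2), key=lambda e: weight[e])
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so keep it
            parent[root_u] = root_v
            mst.add((u, v))
    return mst


def estimate_conditional_probability(N=30, n=10, trials=2000, seed=0):
    """Estimate P(edge in population MST | edge in sample MST) by simulation."""
    rng = random.Random(seed)
    population = list(range(N))
    hits = 0
    total = 0
    for _ in range(trials):
        # Assumption: i.i.d. uniform weights stand in for "unique edge weights".
        weight = {e: rng.random() for e in combinations(population, 2)}
        population_mst = kruskal_mst(population, weight)
        sample = rng.sample(population, n)  # uniform sample without replacement
        sample_mst = kruskal_mst(sample, weight)
        hits += len(sample_mst & population_mst)
        total += len(sample_mst)
    return hits / total


if __name__ == "__main__":
    estimate = estimate_conditional_probability()
    print(f"estimated: {estimate:.3f}, n/N: {10 / 30:.3f}")
```

With the defaults (N = 30, n = 10), the estimate should land near n/N = 1/3, up to simulation noise. The sketch covers only the complete-graph case. The other graph models in the paper would need their own edge sets and weights.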
Abstract
Minimum spanning trees (MSTs) are used in a variety of fields, from computer science to geography. Infectious disease researchers have used them to infer the transmission pathway of certain pathogens. However, these are often the MSTs of sample networks, not population networks, and surprisingly little is known about what can be inferred about a population MST from a sample MST. We prove that if n nodes (the sample) are selected uniformly at random from a complete graph with N nodes and unique edge weights (the population), the probability that an edge is in the population graph's MST given that it is in the sample graph's MST is n/N. We use simulation to investigate this conditional probability for G(N,p) graphs, Barabási-Albert (BA) graphs, graphs whose nodes are distributed in R^2 according to a bivariate standard normal distribution, and an empirical…
Taxonomy
Topics: Complex Network Analysis Techniques · Stochastic processes and statistical mechanics · Bayesian Methods and Mixture Models
