Uncovering missing links with cold ends
Yu-Xiao Zhu, Linyuan Lü, Qian-Ming Zhang, Tao Zhou

TL;DR
This paper investigates the challenge of predicting missing links involving low-degree nodes in networks, revealing that the Leicht-Holme-Newman index outperforms others under realistic sampling conditions and proposing an improved index for better accuracy.
Contribution
The study uncovers the bias in standard link prediction evaluation and introduces a parameter-dependent index that enhances prediction of low-degree node links.
Findings
Leicht-Holme-Newman index performs best for low-degree node links.
Proposed index significantly improves prediction accuracy.
Validation on real sampling methods confirms effectiveness.
Abstract
To evaluate the performance of prediction of missing links, the known data are randomly divided into two parts, the training set and the probe set. We argue that this straightforward and standard method may lead to terrible bias, since in real biological and information networks, missing links are more likely to be links connecting low-degree nodes. We therefore study how to uncover missing links with low-degree nodes, that is, the case where links in the probe set have lower degree products than under random sampling. Experimental analysis on ten local similarity indices and four disparate real networks reveals a surprising result: the Leicht-Holme-Newman index [E. A. Leicht, P. Holme, and M. E. J. Newman, Phys. Rev. E 73, 026120 (2006)] performs the best, although it was known to be one of the worst indices if the probe set is a random sampling of all links. We further propose a parameter-dependent…
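As a rough illustration of the ideas in the abstract, the sketch below computes the local Leicht-Holme-Newman index in its commonly used form, the number of common neighbours of x and y divided by the degree product k_x k_y, and builds a probe set biased toward low degree products. The toy graph, the function names, and the deterministic "take the lowest degree products" split are assumptions made for this example; the paper's actual sampling procedure and its parameter-dependent index are not reproduced here.

```python
# Illustrative sketch only: toy data and helper names are hypothetical.

def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbours(adj, x, y):
    """Number of neighbours shared by x and y."""
    return len(adj.get(x, set()) & adj.get(y, set()))

def lhn_index(adj, x, y):
    """Local LHN similarity: common neighbours divided by k_x * k_y."""
    kx = len(adj.get(x, set()))
    ky = len(adj.get(y, set()))
    if kx == 0 or ky == 0:
        return 0.0  # an isolated node has no local similarity signal
    return common_neighbours(adj, x, y) / (kx * ky)

def low_degree_probe_split(edges, probe_size):
    """Assumed biased split: the probe set holds the links with the
    smallest degree products; the remaining links form the training set."""
    adj = build_adjacency(edges)
    ranked = sorted(
        edges,
        key=lambda e: (len(adj[e[0]]) * len(adj[e[1]]), e),  # tie-break on the edge
    )
    probe = ranked[:probe_size]
    training = [e for e in edges if e not in probe]
    return training, probe

# Toy undirected network (hypothetical).
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
adj = build_adjacency(edges)

# Both candidate links share exactly one neighbour, so a plain
# common-neighbour count cannot separate them; LHN favours the pair
# with the smaller degree product.
score_bd = lhn_index(adj, "b", "d")
score_ae = lhn_index(adj, "a", "e")

training, probe = low_degree_probe_split(edges, probe_size=1)
```

In this toy case the candidate pair involving the degree-1 node "e" receives the higher LHN score, which is consistent with the intuition that dividing by the degree product favours links with cold ends.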
