Node similarity distribution of complex networks and its application in link prediction
Cunlai Pu, Jie Li, Jian Wang, and Tony Q. S. Quek

TL;DR
This paper analyzes the distribution of node similarity in complex networks, focusing on common neighbor based similarity (CNS), and shows how these distributions determine link prediction performance.
Contribution
It introduces a general generating-function framework for calculating the CNS distributions of node sets, and connects these distributions to link prediction by deriving theoretical solutions for precision and AUC.
Findings
In ER random networks, the CNS distribution of node sets of any given size follows a Poisson law.
Link prediction performance depends solely on the CNS distributions of connected and unconnected node pairs.
Theoretical formulas for precision and AUC are derived, reducing computational cost.
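The Poisson finding for ER networks can be checked numerically for node pairs. The sketch below is not taken from the paper. It relies on a standard fact: in G(n, p), the number of common neighbors of a fixed pair is Binomial(n - 2, p^2), which is close to Poisson with mean (n - 2) p^2 when p is small. The function names, the network size, and the value of p are illustrative assumptions.

```python
import itertools
import math
import random
from collections import Counter

def er_graph(n, p, rng):
    """Sample an ER graph G(n, p) as a list of neighbor sets."""
    adj = [set() for _ in range(n)]
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def cns_histogram(adj):
    """Count how many node pairs have k common neighbors, for each k."""
    counts = Counter()
    for u, v in itertools.combinations(range(len(adj)), 2):
        counts[len(adj[u] & adj[v])] += 1
    return counts

def poisson_pmf(k, lam):
    """Poisson probability mass at k with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Illustrative parameters (assumptions, not values from the paper).
n, p = 300, 0.05
rng = random.Random(0)
adj = er_graph(n, p, rng)
hist = cns_histogram(adj)
total = sum(hist.values())

# Poisson mean implied by the binomial argument above.
lam = (n - 2) * p**2
empirical_mean = sum(k * c for k, c in hist.items()) / total

print(f"lambda = {lam:.3f}, empirical mean CNS = {empirical_mean:.3f}")
for k in range(5):
    print(k, f"empirical={hist[k] / total:.4f}", f"poisson={poisson_pmf(k, lam):.4f}")
```

The pairwise loop is quadratic in n, so this is only practical for small networks. Avoiding this kind of enumeration is part of the motivation for closed-form distributions.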
Abstract
Over the years, quantifying the similarity of nodes has been a hot topic in complex networks, yet little has been known about the distributions of node-similarity. In this paper, we consider a typical measure of node-similarity called the common neighbor based similarity (CNS). By means of the generating function, we propose a general framework for calculating the CNS distributions of node sets in various complex networks. In particular, we show that for the Erdős-Rényi (ER) random network, the CNS distribution of node sets of any particular size obeys the Poisson law. We also connect the node-similarity distribution to the link prediction problem. We found that the performance of link prediction depends solely on the CNS distributions of the connected and unconnected node pairs in the network. Furthermore, we derive theoretical solutions of two key evaluation metrics in link…
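To illustrate why link prediction performance can be read off from two score distributions, the sketch below computes AUC directly from the CNS distributions of connected and unconnected pairs. It uses the standard definition of AUC in link prediction: the probability that a randomly chosen connected pair scores higher than a randomly chosen unconnected pair, with ties counted as one half. This is not the paper's derived formula, which is not reproduced in this summary. The function name and the example distributions are hypothetical.

```python
def auc_from_distributions(pmf_pos, pmf_neg):
    """AUC from two score distributions.

    pmf_pos: dict mapping CNS score -> probability for connected pairs.
    pmf_neg: dict mapping CNS score -> probability for unconnected pairs.
    Ties contribute 0.5, following the usual AUC convention.
    """
    auc = 0.0
    for s_pos, p_pos in pmf_pos.items():
        for s_neg, p_neg in pmf_neg.items():
            if s_pos > s_neg:
                auc += p_pos * p_neg
            elif s_pos == s_neg:
                auc += 0.5 * p_pos * p_neg
    return auc

# Hypothetical distributions, chosen only for illustration.
pmf_connected = {0: 0.2, 1: 0.3, 2: 0.5}
pmf_unconnected = {0: 0.7, 1: 0.2, 2: 0.1}

auc = auc_from_distributions(pmf_connected, pmf_unconnected)
print(f"AUC = {auc:.3f}")
```

Because the computation only touches the two distributions, its cost depends on the number of distinct score values rather than on the number of node pairs.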
Taxonomy
Topics: Complex Network Analysis Techniques · Opinion Dynamics and Social Influence · Bioinformatics and Genomic Networks
