Formational bounds of link prediction in collaboration networks
Jinseok Kim, Jana Diesner

TL;DR
This paper investigates the fundamental limits of link prediction in collaboration networks. It finds that current methods can predict at most about 25% of links involving existing members, because of constraints on how links form.
Contribution
The study identifies the distribution of link formation types in four large-scale, over-time collaboration networks and uses it to show the upper bound on what current link prediction methods can achieve.
Findings
Current link predictors can predict up to 25% of links involving existing members.
Most links involving new participants are inherently unpredictable.
Gains in prediction accuracy may be limited when the share of predictable links is low.
Abstract
Link prediction in collaboration networks is often solved by identifying structural properties of existing nodes that are disconnected at one point in time, and that share a link later on. The maximally possible recall rate or upper bound of this approach's success is capped by the proportion of links that are formed among existing nodes embedded in these properties. Consequently, sustained ties as well as links that involve one or two new network participants are typically not predicted. The purpose of this study is to highlight formational constraints that need to be considered to increase the practical value of link prediction methods for collaboration networks. In this study, we identify the distribution of basic link formation types based on four large-scale, over-time collaboration networks, showing that current link predictors can maximally anticipate around 25% of links that…
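To make the idea of a formational upper bound concrete, the following is a minimal sketch, not the authors' implementation. It sorts the links observed in a later period into the formation types the abstract mentions (sustained ties, links with one new participant, links with two new participants, and new links among existing nodes) and reports the share of the last type. The function names, the type labels, the toy data, and the choice of all later-period links as the denominator are assumptions made for illustration. The sketch also ignores any further structural requirement a particular predictor may impose.

```python
from collections import Counter


def classify_links(train_edges, test_edges):
    """Count later-period links by formation type (labels are hypothetical)."""
    # Undirected edges are stored as frozensets so (a, b) equals (b, a).
    train = {frozenset(e) for e in train_edges}
    existing_nodes = {node for edge in train for node in edge}

    counts = Counter()
    for edge in {frozenset(e) for e in test_edges}:
        if len(edge) != 2:
            continue  # skip self-loops
        n_new = sum(1 for node in edge if node not in existing_nodes)
        if n_new == 2:
            counts["both_new"] += 1            # two new participants
        elif n_new == 1:
            counts["one_new"] += 1             # one new participant
        elif edge in train:
            counts["sustained"] += 1           # tie already present earlier
        else:
            counts["new_among_existing"] += 1  # the only type assumed predictable
    return counts


def recall_upper_bound(counts):
    """Share of later-period links that are new links among existing nodes."""
    total = sum(counts.values())
    return counts["new_among_existing"] / total if total else 0.0


# Toy data, invented for illustration only.
train = [("a", "b"), ("b", "c")]
test = [("a", "b"), ("a", "c"), ("c", "d"), ("d", "e"), ("e", "f")]

counts = classify_links(train, test)
print(dict(counts))
print(recall_upper_bound(counts))  # 0.2 for this toy data
```

In this toy case only the link between a and c joins two previously disconnected existing nodes, so a predictor restricted to such pairs could recover at most one of the five later-period links, however good its ranking.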
