Can LLMs Predict Academic Collaboration? Topology Heuristics vs. LLM-Based Link Prediction on Real Co-authorship Networks
Fan Huang, Munjung Kim

TL;DR
This study evaluates whether large language models can predict future scientific collaborations from author profiles alone. On large co-authorship networks, the LLM outperforms traditional topology heuristics at predicting new edges, and the two approaches capture distinct signals.
Contribution
The paper introduces an LLM-based approach to link prediction in co-authorship networks and highlights the model's ability to use author metadata that graph structure does not capture.
Findings
LLMs outperform topology heuristics in new-edge prediction tasks.
Author research concepts are the most significant predictive signal.
Providing graph features to the LLM reduces performance, which suggests that author profiles and graph structure act as separate information channels.
Abstract
Can large language models (LLMs) predict which researchers will collaborate? We study this question through link prediction on real-world co-authorship networks from OpenAlex (9.96M authors, 108.7M edges), evaluating whether LLMs can predict future scientific collaborations using only author profiles, without access to graph structure. Using Qwen2.5-72B-Instruct across three historical eras of AI research, we find that LLMs and topology heuristics capture distinct signals and are strongest in complementary settings. On new-edge prediction under natural class imbalance, the LLM achieves AUROC 0.714-0.789, outperforming Common Neighbors, Jaccard, and Preferential Attachment, with recall up to 92.9%; under balanced evaluation, the LLM outperforms all topology heuristics in every era (AUROC 0.601-0.658 vs. best-heuristic 0.525-0.538); on continued edges, the LLM (0.687) is…
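For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of the three topology heuristics (Common Neighbors, Jaccard, Preferential Attachment) scored with AUROC. It uses the standard textbook definitions, not the paper's code; the toy graph, the author labels, and the `future_edges` set are hypothetical and exist only to make the example runnable.

```python
from itertools import combinations

# Hypothetical toy co-authorship graph (not data from the paper).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]


def build_adjacency(edge_list):
    """Map each author to the set of their co-authors."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of co-authors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared co-authors divided by the union of both co-author sets."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def preferential_attachment(adj, u, v):
    """Product of the two authors' degrees."""
    return len(adj[u]) * len(adj[v])


def auroc(scores, labels):
    """Probability that a random positive pair outscores a random negative
    pair, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


adj = build_adjacency(edges)

# Candidate new edges: every pair of authors not yet connected.
candidates = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]

# Hypothetical ground truth: which candidate pairs collaborate later.
future_edges = {("a", "d"), ("c", "e")}
labels = [1 if pair in future_edges else 0 for pair in candidates]

heuristics = {
    "common_neighbors": common_neighbors,
    "jaccard": jaccard,
    "preferential_attachment": preferential_attachment,
}
results = {
    name: auroc([fn(adj, u, v) for u, v in candidates], labels)
    for name, fn in heuristics.items()
}

for name, value in results.items():
    print(f"{name}: AUROC = {value:.3f}")
```

All three scores depend only on the existing graph, which is why they have little to say about pairs with no shared neighborhood. An LLM scoring the same candidate pairs from author profiles could be evaluated with the same `auroc` function by substituting its predicted probabilities for the heuristic scores.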
