Runaway Events Dominate the Heavy Tail of Citation Distributions
Michael Golosovsky, Sorin Solomon

TL;DR
This study analyzes citation distributions of physics papers, revealing that while a power-law fits most data, extremely highly cited papers exhibit a runaway tail behavior that can be predicted early.
Contribution
It provides the first large-scale evidence that citation distributions follow a power-law with a runaway tail, and introduces methods to predict extreme citation events.
Findings
A discrete power law fits 99.955% of the citation data
Runaway tail behavior occurs at 1000-1500 citations
Autocorrelation measures can predict runaway events
Abstract
Statistical distributions with heavy tails are ubiquitous in natural and social phenomena. Since the entries in the heavy tail have disproportionate significance, knowledge of its exact shape is very important. Citations of scientific papers form one of the best-known heavy-tailed distributions. Even in this case there is considerable debate over whether the citation distribution follows a log-normal or a power-law fit. The goal of our study is to resolve this debate by measuring the citation distribution for a very large and homogeneous dataset. We measured the citation distribution for 418,438 Physics papers published in 1980-1989 and cited by 2008. While the log-normal fit deviates too strongly from the data, the discrete power-law function does better and fits 99.955% of the data. However, the extreme tail of the distribution deviates upward even from the power-law fit and…
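To make the idea of a discrete power-law fit concrete, here is a minimal Python sketch of a maximum-likelihood estimate of the exponent for p(k) proportional to k^(-gamma), k >= k_min. It is an illustration under stated assumptions, not the authors' procedure: the function names (`hurwitz_zeta`, `fit_discrete_power_law`, `sample_discrete_power_law`) are hypothetical, the data are synthetic, and the exponent 3.0 used for sampling is an arbitrary illustrative value, not a result from the paper. Papers with zero citations are excluded because the power law is only defined for k >= 1.

```python
import math
import random


def hurwitz_zeta(s, q, n_terms=1000):
    """Approximate sum_{k>=0} (q + k)^(-s) for s > 1.

    Direct sum of the first n_terms plus an Euler-Maclaurin tail correction.
    """
    head = sum((q + k) ** -s for k in range(n_terms))
    a = q + n_terms
    tail = a ** (1 - s) / (s - 1) + 0.5 * a ** -s + s * a ** (-s - 1) / 12
    return head + tail


def fit_discrete_power_law(citations, k_min=1, lo=1.05, hi=6.0, tol=1e-6):
    """Maximum-likelihood exponent for p(k) = k^(-gamma) / zeta(gamma, k_min).

    Assumes k_min >= 1 and that the true exponent lies in [lo, hi].
    """
    data = [k for k in citations if k >= k_min]
    n = len(data)
    sum_log = sum(math.log(k) for k in data)

    def neg_log_lik(gamma):
        return n * math.log(hurwitz_zeta(gamma, k_min)) + gamma * sum_log

    # Golden-section search; the negative log-likelihood is convex in gamma.
    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = neg_log_lik(c), neg_log_lik(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = neg_log_lik(c)
        else:
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = neg_log_lik(d)
    return (a + b) / 2


def sample_discrete_power_law(gamma, n, k_min=1, k_max=10_000, seed=0):
    """Draw synthetic citation counts; the support is truncated at k_max."""
    rng = random.Random(seed)
    support = range(k_min, k_max + 1)
    weights = [k ** -gamma for k in support]
    return rng.choices(support, weights=weights, k=n)


if __name__ == "__main__":
    synthetic = sample_discrete_power_law(gamma=3.0, n=20_000)
    print("estimated exponent:", round(fit_discrete_power_law(synthetic), 2))
```

On real citation counts, one would compare this fit against a log-normal fit and inspect the residuals in the extreme tail, which is where the abstract reports an upward deviation from the power law.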
