Modeling a Century of Citation Distributions
Matthew L. Wallace, Vincent Larivière, Yves Gingras

TL;DR
This paper analyzes a century of citation data to understand the dynamics of scientific recognition, proposing models that explain uncitedness and shifts in citation practices over time.
Contribution
It introduces a simple random selection model for citation distributions and demonstrates the effectiveness of stretched-exponential and Tsallis functions in fitting empirical data.
Findings
Uncitedness depends on publication and citation volumes.
Citation distributions are well modeled by stretched-exponential and Tsallis functions.
Citation practices shift significantly around 1960.
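For readers unfamiliar with the two fitting functions named above, the following is a minimal sketch of their commonly used forms. The parameter names (n0, x0, beta, lam, q) and the exact parametrization are assumptions made for illustration; the paper's own notation and normalization may differ.

```python
import math


def stretched_exponential(x, n0, x0, beta):
    """Stretched exponential: n0 * exp(-(x / x0) ** beta).

    x is the citation count; n0, x0 and beta are illustrative fit
    parameters (assumed names). beta = 1 gives a plain exponential.
    """
    return n0 * math.exp(-((x / x0) ** beta))


def tsallis(x, n0, lam, q):
    """Tsallis (q-exponential) form: n0 * [1 + (q - 1) * lam * x] ** (-1 / (q - 1)).

    Valid for q > 1. As q approaches 1 it tends to n0 * exp(-lam * x);
    for large x it decays as a power law with exponent 1 / (q - 1).
    The parametrization here is an assumption, not the paper's.
    """
    return n0 * (1.0 + (q - 1.0) * lam * x) ** (-1.0 / (q - 1.0))


if __name__ == "__main__":
    # Compare the two shapes over a range of citation counts.
    for x in (0, 1, 10, 100):
        print(x, stretched_exponential(x, 1.0, 5.0, 0.5), tsallis(x, 1.0, 0.2, 1.5))
```

Both functions decay more slowly than a plain exponential at large x, which is why they are candidates for describing distributions with a small number of highly cited papers.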
Abstract
Changes in citation distributions over 100 years can reveal much about the evolution of the scientific communities or disciplines. The prevalence of uncited papers or of highly-cited papers, with respect to the bulk of publications, provides important clues as to the dynamics of scientific research. Using 25 million papers and 600 million references from the Web of Science over the 1900-2006 period, this paper proposes a simple model based on a random selection process to explain the "uncitedness" phenomenon and its decline in recent years. We show that the proportion of uncited papers is a function of 1) the number of articles published in a given year (the competing papers) and 2) the number of articles subsequently published (the citing papers) and the number of references they contain. Using uncitedness as a departure point, we demonstrate the utility of the stretched-exponential…
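The random selection idea in the abstract can be illustrated with a toy calculation. This is a minimal sketch under a strong simplifying assumption (every reference picks one of the competing papers uniformly at random and independently); it is not the authors' actual model, and the function and parameter names are hypothetical.

```python
import random


def expected_uncited_fraction(n_papers, n_citations):
    """Probability that a given paper receives none of n_citations
    references, when each reference picks one of n_papers uniformly
    at random (assumed toy model). Approximately exp(-n_citations / n_papers).
    """
    return (1.0 - 1.0 / n_papers) ** n_citations


def simulate_uncited_fraction(n_papers, n_citations, seed=0):
    """Monte Carlo version of the same toy model."""
    rng = random.Random(seed)
    cited = set()
    for _ in range(n_citations):
        cited.add(rng.randrange(n_papers))  # one reference hits one paper
    return 1.0 - len(cited) / n_papers


if __name__ == "__main__":
    n_papers = 10_000
    # n_citations stands for (number of citing papers) x (references per paper)
    for n_citations in (5_000, 20_000, 50_000):
        print(
            n_citations,
            expected_uncited_fraction(n_papers, n_citations),
            simulate_uncited_fraction(n_papers, n_citations),
        )
```

In this sketch the uncited fraction falls as the pool of incoming references grows relative to the number of competing papers, which matches the qualitative dependence stated in the abstract.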
Taxonomy
Topics: scientometrics and bibliometrics research
