
TL;DR
This paper studies misprints in citations and shows that most citations (an estimated 70-90%) are copied from the reference lists of other papers, which can explain both how misprints spread and why some papers become highly cited.
Contribution
It introduces a stochastic model of citation copying that accounts for both the empirically observed distribution of citations and the propagation of misprints in the scientific literature.
Findings
Most misprints are repeated from earlier citations.
Citation copying explains the distribution of citations among papers.
A simple model reproduces observed citation patterns.
Abstract
We present empirical data on misprints in citations to twelve high-profile papers. The great majority of misprints are identical to misprints in articles that earlier cited the same paper. The distribution of the numbers of misprint repetitions follows a power law. We develop a stochastic model of the citation process, which explains these findings and shows that about 70-90% of scientific citations are copied from the lists of references used in other papers. Citation copying can explain not only why some misprints become popular, but also why some papers become highly cited. We show that a model where a scientist picks a few random papers, cites them, and copies a fraction of their references accounts quantitatively for the empirically observed distribution of citations.
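The copying mechanism described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: the function name `simulate_copying` and the parameters `n_papers`, `n_seeds`, and `copy_prob` are hypothetical, and the default values are illustrative rather than fitted to any data. Each new paper picks a few earlier papers at random, cites them, and copies each of their references with a fixed probability.

```python
import random


def simulate_copying(n_papers=2000, n_seeds=2, copy_prob=0.25, rng_seed=0):
    """Toy citation-copying process (illustrative assumptions only).

    Each new paper:
      1. picks up to `n_seeds` earlier papers uniformly at random and cites them;
      2. copies each reference of those papers with probability `copy_prob`.
    Returns (references, citations): references[i] lists the papers cited by
    paper i, and citations[i] counts how often paper i is cited.
    """
    rng = random.Random(rng_seed)
    references = [[]]  # paper 0 has nothing earlier to cite
    citations = [0]

    for new in range(1, n_papers):
        cited = set()
        # Step 1: pick a few random earlier papers and cite them directly.
        seeds = rng.sample(range(new), min(n_seeds, new))
        for s in seeds:
            cited.add(s)
            # Step 2: copy a fraction of each seed paper's reference list.
            for r in references[s]:
                if rng.random() < copy_prob:
                    cited.add(r)

        references.append(sorted(cited))
        citations.append(0)
        for c in cited:
            citations[c] += 1

    return references, citations


if __name__ == "__main__":
    refs, cites = simulate_copying()
    mean = sum(cites) / len(cites)
    print(f"mean citations per paper: {mean:.2f}")
    print(f"most-cited paper has {max(cites)} citations")
```

Because a paper that is already cited often appears in more reference lists, it is more likely to be copied again, so a few papers in the sketch accumulate far more citations than the average. Comparing the simulated counts against real citation data would require choosing `n_seeds` and `copy_prob` carefully, which this sketch does not attempt.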
