Comparing Published Scientific Journal Articles to Their Pre-print Versions
Martin Klein, Peter Broadwell, Sharon E. Farb, Todd Grappone

TL;DR
This study empirically compares pre-prints of scientific articles with their final published versions. It finds minimal textual differences, which calls into question the value publishers claim to add and has implications for library subscription decisions.
Contribution
It provides the first large-scale empirical analysis quantifying textual differences between pre-prints and final publications, challenging assumptions about publishers' added value.
Findings
Minimal textual changes from pre-print to final version
Questions the extent of publishers' added value
Implications for library subscription decisions
Abstract
Academic publishers claim that they add value to scholarly communications by coordinating reviews and contributing and enhancing text during publication. These contributions come at a considerable cost: U.S. academic libraries paid $1.7 billion for serial subscriptions in 2008 alone. Library budgets, in contrast, are flat and not able to keep pace with serial price inflation. We have investigated the publishers' value proposition by conducting a comparative study of pre-print papers and their final published counterparts. This comparison had two working assumptions: 1) if the publishers' argument is valid, the text of a pre-print paper should vary measurably from its corresponding final published version, and 2) by applying standard similarity measures, we should be able to detect and quantify such differences. Our analysis revealed that the text contents of the scientific papers…
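The second working assumption, that standard similarity measures can detect and quantify differences between two versions of a text, can be illustrated with a minimal Python sketch. The abstract as shown here does not name the measures the authors used, so the two below (an order-sensitive sequence ratio from Python's `difflib` and a Jaccard overlap of word sets) are assumptions chosen for illustration, as are the function names and the sample sentences.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase word tokens; punctuation and whitespace differences are ignored."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sequence_ratio(a, b):
    """Order-sensitive similarity in [0, 1], computed over word tokens."""
    # autojunk=False keeps frequent words from being discarded on long texts.
    return SequenceMatcher(None, tokens(a), tokens(b), autojunk=False).ratio()


def jaccard(a, b):
    """Order-insensitive overlap of the two vocabularies, in [0, 1]."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts, not taken from the study's corpus.
preprint = "We measure the similarity of pre-print and published texts."
published = "We measure the similarity of pre-print and final published texts."

print(round(sequence_ratio(preprint, published), 3))
print(round(jaccard(preprint, published), 3))
```

Scores near 1 on both measures would indicate that a published version differs little from its pre-print. Using one order-sensitive and one order-insensitive measure helps separate rewording from rearrangement, though this pairing is a design choice of the sketch rather than something stated in the abstract.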
Taxonomy
Topics: Scientometrics and Bibliometrics Research · Publishing and Scholarly Communication · Library Collection Development and Digital Resources
