Nine Million Book Items and Eleven Million Citations: A Study of Book-Based Scholarly Communication Using OpenCitations
Yongjun Zhu, Erjia Yan, Silvio Peroni, Chao Che

TL;DR
This study analyzes a large open citation dataset to understand the role and characteristics of books in scholarly communication, revealing their citation patterns, temporal dynamics, and subject focus.
Contribution
The paper provides a large-scale quantitative analysis of book citation patterns using the COCI dataset from OpenCitations, documenting books' low share of total citations and how their citations change over time.
Findings
Books account for less than 4% of total citations.
Most books are cited fewer than ten times.
Books take longer to reach their citation peak but are cited over a similar duration to journal articles.
Abstract
Books have been widely used to share information and contribute to human knowledge. However, the quantitative use of books as a method of scholarly communication is relatively unexamined compared to journal articles and conference papers. This study uses the COCI dataset (a comprehensive open citation dataset provided by OpenCitations) to explore books' roles in scholarly communication. The COCI data we analyzed includes 445,826,118 citations from 46,534,705 bibliographic entities. By analyzing such a large amount of data, we provide a thorough, multifaceted understanding of books. Among the investigated factors are 1) temporal changes to book citations; 2) book citation distributions; 3) years to citation peak; 4) citation half-life; and 5) characteristics of the most-cited books. Results show that books have received less than 4% of total citations, and have been cited mainly by…
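Two of the factors listed in the abstract, years to citation peak and citation half-life, can be illustrated with a small Python sketch. The definitions used here are assumptions, since the summary does not state the paper's exact formulas: years to peak is taken as the gap between the publication year and the year with the most incoming citations, and half-life as the number of years after publication by which half of all citations have arrived. The function names and the sample data are hypothetical.

```python
from collections import Counter


def years_to_peak(pub_year, citing_years):
    """Years from publication to the year with the most citations.

    Assumed definition; ties are broken in favor of the earliest year.
    """
    if not citing_years:
        raise ValueError("no citations to analyze")
    counts = Counter(citing_years)
    # Highest count first, then the earliest year among equal counts.
    peak_year = min(counts, key=lambda year: (-counts[year], year))
    return peak_year - pub_year


def citation_half_life(pub_year, citing_years):
    """Years from publication until half of all citations were received.

    Assumed definition: the year of the ceil(n / 2)-th citation,
    counted in chronological order, minus the publication year.
    """
    if not citing_years:
        raise ValueError("no citations to analyze")
    ordered = sorted(citing_years)
    half_index = (len(ordered) + 1) // 2 - 1  # zero-based ceil(n / 2)
    return ordered[half_index] - pub_year


# Hypothetical book published in 2000 with seven incoming citations.
sample_citing_years = [2001, 2002, 2002, 2006, 2008, 2009, 2012]
peak = years_to_peak(2000, sample_citing_years)
half_life = citation_half_life(2000, sample_citing_years)
```

For this made-up book, the citation count peaks early while half of the citations take several more years to accumulate, which is the kind of contrast between the two measures that the study examines at scale.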
