Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge?
Richard Klavans, Kevin W. Boyack

TL;DR
This study compares citation-based methods for creating scientific taxonomies and finds that direct citation yields the most accurate clusters, outperforming bibliographic coupling and co-citation, especially at the topic level.
Contribution
It provides a comparative analysis of citation-based clustering methods and introduces new gold standards for evaluating taxonomy accuracy.
Findings
Direct citation produces more concentrated and accurate clusters.
Discipline-level journal schemas are less accurate than topic-level taxonomies.
Direct citation outperforms bibliographic coupling and co-citation in taxonomy accuracy.
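One way to make "more concentrated" concrete is a Herfindahl-style index over the clusters that a gold-standard article's references fall into. The sketch below is illustrative only: it assumes this kind of index rather than reproducing the study's exact procedure, and the names `herfindahl`, `cluster_of`, and `references` are hypothetical.

```python
from collections import Counter

def herfindahl(cluster_of, references):
    """Concentration of a reference list across clusters.

    cluster_of: dict mapping document id -> cluster id (hypothetical input).
    references: iterable of document ids cited by one gold-standard article.
    Returns a value in (0, 1]; 1.0 means every reference sits in one cluster.
    References missing from the clustering are ignored (an assumption).
    """
    counts = Counter(cluster_of[r] for r in references if r in cluster_of)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) ** 2 for n in counts.values())

# Toy example: three of four references land in cluster 1, one in cluster 2.
toy_clusters = {"a": 1, "b": 1, "c": 1, "d": 2}
score = herfindahl(toy_clusters, ["a", "b", "c", "d"])
```

Under this assumption, a clustering method scores higher when the references of a long reference list are packed into fewer clusters.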
Abstract
In 1965, Derek de Solla Price foresaw the day when a citation-based taxonomy of science and technology would be delineated and correspondingly used for science policy. A taxonomy needs to be comprehensive and accurate if it is to be useful for policy making, especially now that policy makers are utilizing citation-based indicators to evaluate people, institutions and laboratories. Determining the accuracy of a taxonomy, however, remains a challenge. Previous work on the accuracy of partition solutions is sparse, and the results of those studies, while useful, have not been definitive. In this study we compare the accuracies of topic-level taxonomies based on the clustering of documents using direct citation, bibliographic coupling, and co-citation. Using a set of new gold standards (articles with at least 100 references), we find that direct citation is better at concentrating…
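The three relatedness signals compared in the abstract can be derived from the same citation data. The following minimal sketch uses the standard definitions (direct citation: one document cites another; bibliographic coupling: two documents share references; co-citation: two documents are cited together by a third). The function name `citation_links` and the toy data are hypothetical, and the study's own weighting and clustering steps are not shown.

```python
from collections import defaultdict
from itertools import combinations

def citation_links(refs):
    """Derive three link types from refs: paper id -> set of cited ids."""
    papers = set(refs)

    # Direct citation: A cites B, and B is itself in the document set.
    direct = {(a, b) for a, cited in refs.items() for b in cited if b in papers}

    # Bibliographic coupling: number of references two papers share.
    coupling = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:
            coupling[(a, b)] = shared

    # Co-citation: number of papers that cite both members of a pair.
    cocitation = defaultdict(int)
    for cited in refs.values():
        for a, b in combinations(sorted(cited), 2):
            cocitation[(a, b)] += 1

    return direct, coupling, dict(cocitation)

# Toy data: P2 and P3 cite earlier papers in the set; R1 and R2 are outside it.
toy = {
    "P1": {"R1", "R2"},
    "P2": {"R1", "R2", "P1"},
    "P3": {"P1", "P2"},
}
direct, coupling, cocitation = citation_links(toy)
```

In this toy set, P1 and P2 are linked by all three signals: P2 cites P1 directly, they share R1 and R2 as references, and P3 cites both of them.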
