The Closer the Better: Similarity of Publication Pairs at Different Co-Citation Levels
Giovanni Colavizza, Kevin W. Boyack, Nees Jan van Eck, Ludo Waltman

TL;DR
This study analyzes how the similarity of co-cited articles increases as the co-citation level becomes more granular (from journal down to bracket), showing that textual similarity, shared references, shared authors, and proximity in publication time all increase at finer textual units across four diverse journals.
Contribution
It provides a detailed analysis of how various similarity measures change across co-citation levels, highlighting the importance of granular co-citation data for scientific mapping and literature retrieval.
Findings
Similarity increases at finer co-citation levels
Main gain from journal to article co-citation
Consistent results across four diverse journals
Abstract
We investigate the similarities of pairs of articles which are co-cited at the different co-citation levels of the journal, article, section, paragraph, sentence and bracket. Our results indicate that textual similarity, intellectual overlap (shared references), author overlap (shared authors), and proximity in publication time all rise monotonically as the co-citation level gets lower (from journal to bracket). While the main gain in similarity happens when moving from journal to article co-citation, all level changes entail an increase in similarity, especially section to paragraph and paragraph to sentence/bracket levels. We compare results from four journals over the years 2010-2015: Cell, the European Journal of Operational Research, Physics Letters B and Research Policy, with consistent general outcomes and some interesting differences. Our findings motivate the use of granular…
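The nesting of co-citation levels described in the abstract can be illustrated with a minimal sketch. It assumes each in-text citation carries a hypothetical position tuple (section, paragraph, sentence, bracket) within one citing article, and it uses Jaccard overlap as one common way to score shared references; the names `finest_level`, `pair_levels`, and `jaccard` are illustrative, and the paper's own measures may differ.

```python
from itertools import combinations

# Levels below the journal level, coarsest to finest (names follow the abstract).
LEVELS = ["article", "section", "paragraph", "sentence", "bracket"]


def finest_level(pos_a, pos_b):
    """Finest level shared by two in-text citations from the same article.

    Each position is an assumed tuple (section, paragraph, sentence, bracket).
    The number of leading components that match gives the depth of co-citation.
    """
    depth = 0
    for a, b in zip(pos_a, pos_b):
        if a != b:
            break
        depth += 1
    return LEVELS[depth]


def pair_levels(mentions):
    """Map each pair of cited works to its finest co-citation level.

    mentions: dict of reference id -> list of position tuples, since a work
    may be cited several times in the same article.
    """
    result = {}
    for ref_a, ref_b in combinations(sorted(mentions), 2):
        best = max(
            LEVELS.index(finest_level(pa, pb))
            for pa in mentions[ref_a]
            for pb in mentions[ref_b]
        )
        result[(ref_a, ref_b)] = LEVELS[best]
    return result


def jaccard(set_a, set_b):
    """Jaccard overlap of two sets, e.g. the reference lists of two articles."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy data (invented for illustration, not taken from the paper).
mentions = {
    "A": [(1, 1, 1, 1)],
    "B": [(1, 1, 1, 1)],  # same bracket as A
    "C": [(1, 1, 2, 1)],  # same paragraph as A, different sentence
    "D": [(2, 1, 1, 1)],  # different section
}
levels = pair_levels(mentions)
shared_refs = jaccard({"r1", "r2", "r3"}, {"r2", "r3", "r4"})
```

With pairs grouped by level in this way, any pairwise similarity score (shared references, shared authors, gap in publication year) can be averaged per level and compared across levels.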
Taxonomy
Topics: Scientometrics and bibliometrics research · Advanced Text Analysis Techniques · Complex Network Analysis Techniques
