Normalization of direct citations in publication-level networks: Evaluation of six approaches
Peter Sjögårde, Per Ahlgren

TL;DR
This study evaluates six normalization methods for direct citation relations in publication networks, finding that geometric normalization outperforms fractional normalization in clustering quality and accuracy.
Contribution
The paper provides a comprehensive comparison of normalization approaches for citation networks, highlighting the advantages of geometric normalization over fractional methods.
Findings
Normalization improves clustering quality over no normalization.
Geometric normalization results in fewer misassignments than fractional normalization.
Fractional normalization, though standard, causes more inaccurate publication assignments.
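The contrast between fractional and geometric normalization can be made concrete with a small sketch. This summary does not give the paper's formulas, so the definitions below are assumptions: fractional normalization divides each relation by the total relation weight of the source publication (so the result is directional), and geometric normalization divides it by the geometric mean of the two publications' totals (so the result is symmetric). The function names and the toy network are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def link_totals(edges):
    """Sum the raw relation weights attached to each publication."""
    totals = defaultdict(float)
    for (i, j), w in edges.items():
        totals[i] += w
        totals[j] += w
    return totals


def fractional(edges):
    """Assumed fractional form: weight / total of the source publication.

    The result is directional, so both (i, j) and (j, i) are returned.
    """
    totals = link_totals(edges)
    out = {}
    for (i, j), w in edges.items():
        out[(i, j)] = w / totals[i]
        out[(j, i)] = w / totals[j]
    return out


def geometric(edges):
    """Assumed geometric form: weight / sqrt(total_i * total_j), symmetric."""
    totals = link_totals(edges)
    return {
        (i, j): w / sqrt(totals[i] * totals[j])
        for (i, j), w in edges.items()
    }


# Toy undirected direct-citation network: one entry per linked pair.
edges = {
    ("A", "B"): 1.0,
    ("A", "C"): 1.0,
    ("A", "D"): 1.0,
    ("B", "C"): 1.0,
}

frac = fractional(edges)
geo = geometric(edges)
```

In this toy network, publication D has a single relation, so under the assumed fractional form its link to A receives the maximum weight of 1 from D's side, however many relations A has. The assumed geometric form also takes A's three relations into account and damps that link. Whether this is the mechanism behind the misassignments reported in the paper is not stated in this summary.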
Abstract
Clustering of publication networks is an efficient way to obtain classifications of large collections of research publications. Such classifications can be used to, e.g., detect research topics, normalize citation relations, or explore the publication output of a unit. Citation networks can be created using a variety of approaches. Best practices to obtain classifications using clustering have been investigated, in particular the performance of different publication-publication relatedness measures. However, evaluation of different approaches to normalization of citation relations has not been explored to the same extent. In this paper, we evaluate five approaches to normalization of direct citation relations with respect to clustering solution quality in four data sets. A sixth approach is evaluated using no normalization. To assess the quality of clustering solutions, we use three…
Taxonomy
Topics: Complex Network Analysis Techniques · Advanced Clustering Algorithms Research · Bioinformatics and Genomic Networks
