Search for Evergreens in Science: A Functional Data Analysis
Ruizhi Zhang, Jian Wang, Yajun Mei

TL;DR
This paper introduces a functional data analysis approach to identify and classify citation trajectory patterns of scientific papers, revealing a distinct evergreen cluster with sustained citations over 30 years.
Contribution
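It proposes a functional Poisson regression model for individual papers' citation trajectories, fits it by functional principal component analysis and maximum likelihood estimation, and clusters the estimated paper-specific coefficients with K-means to identify evergreen papers.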
Findings
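A distinct evergreen cluster exists: its papers show no decline in annual citations over 30 years
K-means clustering on the estimated paper-specific coefficients separates the 1699 sampled APS papers into groups with different citation trajectories
The resulting groups describe general types of long-term citation trajectory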
Ruizhi Zhang, Jian Wang & Yajun Mei. (2017). Search for evergreens in science: A functional data analysis. Journal of Informetrics, 11(3), 629–644. http://dx.doi.org/10.1016/j.joi.2017.05.007
©2017 Elsevier Ltd.
The authors thank the editor and three anonymous referees for their constructive comments which have substantially improved this paper. R. Zhang and Y. Mei were supported in part by the NSF grant CMMI-1362876, and J. Wang by a postdoctoral fellowship from the Research Foundation – Flanders (FWO). Data used in this paper are from a bibliometric database developed by the Competence Center for Bibliometrics for the German Science System (KB) and derived from the 1980 to 2012 Science Citation Index Expanded (SCI-E), Social Sciences Citation Index (SSCI), Arts and Humanities Citation Index (AHCI), Conference Proceedings Citation Index–Science (CPCI-S), and Conference Proceedings Citation Index–Social Science & Humanities (CPCI-SSH) prepared by Thomson Reuters (Scientific) Inc. (TR®), Philadelphia, Pennsylvania, USA: ©Copyright Thomson Reuters (Scientific) 2013. KB is funded by the German Federal Ministry of Education and Research (BMBF, project number: 01PQ08004A).
Ruizhi Zhang¹, Jian Wang²,³ & Yajun Mei¹
¹ H. Milton Stewart School of Industrial & Systems Engineering, Georgia Institute of Technology
² Center for R&D Monitoring and Department of Managerial Economics, Strategy & Innovation, KU Leuven
³ German Center for Higher Education Research and Science Studies, DZHW Berlin
(May 17, 2017)
Abstract
Evergreens in science are papers that display a continual rise in annual citations without decline, at least within a sufficiently long time period. Aiming to better understand evergreens in particular and patterns of citation trajectory in general, this paper develops a functional data analysis method to cluster citation trajectories of a sample of 1699 research papers published in 1980 in the American Physical Society (APS) journals. We propose a functional Poisson regression model for individual papers’ citation trajectories, and fit the model to the observed 30-year citations of individual papers by functional principal component analysis and maximum likelihood estimation. Based on the estimated paper-specific coefficients, we apply the K-means clustering algorithm to cluster papers into different groups, for uncovering general types of citation trajectories. The result demonstrates the existence of an evergreen cluster of papers that do not exhibit any decline in annual citations over 30 years.
Keywords: citation trajectory; evergreen; functional Poisson regression; functional principal component analysis; K-means clustering
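The pipeline summarized in the abstract (functional Poisson regression, functional principal component analysis, maximum likelihood estimation, K-means) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the model's exact form, so the sketch assumes a log-linear rate, log λ_i(t) = μ(t) + Σ_k β_ik φ_k(t), takes μ and φ_k from a plain SVD of log(1 + counts), and runs on simulated counts rather than the APS data. All function names (`fpca_basis`, `fit_scores`, `kmeans`) and the simulated rate curves are hypothetical.

```python
import numpy as np

def fpca_basis(counts, n_components):
    """Mean curve and leading principal component curves of log(1 + counts)."""
    logc = np.log1p(counts)                       # shape (n_papers, n_years)
    mean_curve = logc.mean(axis=0)
    # Rows of vt are orthonormal directions over the year grid.
    _, _, vt = np.linalg.svd(logc - mean_curve, full_matrices=False)
    return mean_curve, vt[:n_components]

def poisson_loglik(beta, y, offset, X):
    """Poisson log-likelihood (up to a constant); eta is capped to avoid overflow."""
    eta = np.minimum(offset + X @ beta, 30.0)
    return float(np.sum(y * eta - np.exp(eta)))

def fit_scores(y, offset, basis, n_iter=50, ridge=1e-6):
    """Per-paper MLE of beta via Newton steps with step halving."""
    X = basis.T                                   # shape (n_years, K)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        lam = np.exp(np.minimum(offset + X @ beta, 30.0))
        grad = X.T @ (y - lam)
        # Small ridge on the Hessian for numerical stability only.
        hess = X.T @ (X * lam[:, None]) + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        t, base = 1.0, poisson_loglik(beta, y, offset, X)
        while t > 1e-8 and poisson_loglik(beta + t * step, y, offset, X) < base:
            t *= 0.5                              # halve until likelihood does not drop
        beta = beta + t * step
        if np.max(np.abs(t * step)) < 1e-8:
            break
    return beta

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Simulated data (an assumption, not the paper's sample): 150 papers whose
# citation rate peaks early and declines, and 50 whose rate keeps rising.
rng = np.random.default_rng(0)
years = np.arange(1, 31)
decaying = 6.0 * np.exp(-0.5 * np.log(years / 4.0) ** 2)
rising = 0.3 * years
rates = np.vstack([np.tile(decaying, (150, 1)), np.tile(rising, (50, 1))])
counts = rng.poisson(rates)

mean_curve, basis = fpca_basis(counts, n_components=2)
scores = np.array([fit_scores(y, mean_curve, basis) for y in counts])
labels, centers = kmeans(scores, k=2)
print(np.bincount(labels[:150], minlength=2), np.bincount(labels[150:], minlength=2))
```

The printed counts show how the two simulated groups are split across the two clusters. Using the mean of log(1 + counts) as the baseline curve is a crude stand-in chosen for brevity. The number of components and the number of clusters are likewise illustrative and are not taken from the paper.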
