A Decade of In-text Citation Analysis based on Natural Language Processing and Machine Learning Techniques: An overview of empirical studies
Sehrish Iqbal, Saeed-Ul Hassan, Naif Radi Aljohani, Salem Alelyani, Raheel Nawaz, Lutz Bornmann

TL;DR
This paper reviews a decade of research on in-text citation analysis using NLP and machine learning, highlighting advances in citation context, sentiment, classification, summarization, and recommendation.
Contribution
It provides a comprehensive overview of empirical studies applying NLP and machine learning to citation analysis over the past decade.
Findings
Significant growth in citation analysis research, driven by the availability of citation databases and better access to full-text publication corpora.
Advancements in citation sentiment and content analysis methods.
Development of citation-based recommendation systems.
Abstract
Citation analysis is one of the most frequently used methods in research evaluation. We are seeing significant growth in citation analysis through bibliometric metadata, primarily due to the availability of citation databases such as the Web of Science, Scopus, Google Scholar, Microsoft Academic, and Dimensions. Due to better access to full-text publication corpora in recent years, information scientists have gone far beyond traditional bibliometrics by tapping into advancements in full-text data processing techniques to measure the impact of scientific publications in contextual terms. This has led to technical developments in citation context and content analysis, citation classifications, citation sentiment analysis, citation summarisation, and citation-based recommendation. This article aims to narratively review the studies on these developments. Its primary focus is on…
