Citations are not opinions: a corpus linguistics approach to understanding how citations are made
Domenic Rosati

TL;DR
This study uses corpus linguistics to analyze a large citation dataset, revealing that citation types are characterized by specific linguistic features and are not strongly linked to sentiment, challenging opinion-based views of citations.
Contribution
It introduces a corpus linguistics approach to classify citation types based on linguistic features, moving beyond sentiment analysis to understand citation functions empirically.
Findings
Low correlation between citation type and sentiment.
Citation collocates show low subjectivity across classes.
Citations function as claims-making devices rather than opinions.
Abstract
Citation content analysis seeks to understand citations based on the language used during the making of a citation. A key issue in citation content analysis is looking for linguistic structures that characterize distinct classes of citations for the purposes of understanding the intent and function of a citation. Previous works have focused on modeling linguistic features first and drawn conclusions on the language structures unique to each class of citation function based on the performance of a classification task or inter-annotator agreement. In this study, we start with a large sample of a pre-classified citation corpus, 2 million citations from each class of the scite Smart Citation dataset (supporting, disputing, and mentioning citations), and analyze its corpus linguistics in order to reveal the unique and statistically significant language structures belonging to each type of…
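As an illustration of what finding language that is statistically characteristic of a citation class can look like, the sketch below scores words by Dunning's log-likelihood keyness, a common corpus linguistics statistic. The abstract above does not say which statistic the study uses, so this choice is an assumption. The function names and the toy token lists are likewise hypothetical and are not drawn from the scite dataset.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Dunning's G2 for a word seen `a` times in a target corpus of `c` tokens
    and `b` times in a reference corpus of `d` tokens."""
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_target)
    if b > 0:
        g2 += b * math.log(b / expected_reference)
    return 2.0 * g2


def keywords(target_tokens, reference_tokens, top_n=5):
    """Rank words that are relatively more frequent in the target corpus."""
    target_counts = Counter(target_tokens)
    reference_counts = Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    scored = []
    for word, a in target_counts.items():
        b = reference_counts.get(word, 0)
        # Keep only words over-represented in the target class.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_n]


if __name__ == "__main__":
    # Invented toy sentences standing in for two citation classes.
    supporting = "our results are consistent with previous findings".split()
    disputing = "our results are in contrast with previous findings".split()
    for word, score in keywords(supporting, disputing):
        print(f"{word}\t{score:.3f}")
```

On a real corpus, the same scoring would be run for each class against the pooled remaining classes, and a significance threshold would be applied to the scores before interpreting any word as characteristic.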
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies
