Impact of lexical and sentiment factors on the popularity of scientific papers
Julian Sienkiewicz, Eduardo G. Altmann

TL;DR
This study analyzes how textual and author-related properties of scientific papers relate to citation counts. The relationships are non-linear, and the number of authors and the length and complexity of the abstract show the strongest positive association with citations.
Contribution
A large-scale analysis of 6 textual and author-related factors across 4.3 million papers in over 1500 journals, showing that their correlations with citations are non-linear and differ between most-cited and typical papers.
Findings
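The number of authors correlates positively with citations.
Abstract length and complexity are among the factors with the strongest positive influence on citations.
Correlations differ between most-cited and typical papers: in most journals, short titles correlate positively with citations only for the most-cited papers and negatively for typical papers.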
Abstract
We investigate how textual properties of scientific papers relate to the number of citations they receive. Our main finding is that correlations are non-linear and affect most-cited and typical papers differently. For instance, we find that in most journals short titles correlate positively with citations only for the most-cited papers; for typical papers the correlation is in most cases negative. Our analysis of 6 different factors, calculated both at the title and abstract level of 4.3 million papers in over 1500 journals, reveals the number of authors, and the length and complexity of the abstract, as having the strongest (positive) influence on the number of citations.
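The contrast between typical and most-cited papers can be made concrete by comparing citation quantiles of two groups of papers rather than their means. The sketch below is only an illustration of that idea and is not the authors' method: the citation counts are invented toy numbers, and the helper names `quantile` and `quantile_gap` are hypothetical.

```python
def quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def quantile_gap(group_a, group_b, q):
    """Difference between the q-th citation quantiles of two groups."""
    return quantile(group_a, q) - quantile(group_b, q)


# Toy citation counts, invented for illustration only (not data from the paper).
short_title_citations = [0, 1, 1, 2, 2, 3, 3, 4, 40, 90]
long_title_citations = [1, 2, 3, 3, 4, 4, 5, 6, 10, 15]

# Median: how a "typical" short-title paper compares with a typical long-title one.
typical_gap = quantile_gap(short_title_citations, long_title_citations, 0.5)

# 90th percentile: the same comparison among the most-cited papers.
top_gap = quantile_gap(short_title_citations, long_title_citations, 0.9)

print(f"gap at the median:          {typical_gap:+.1f}")
print(f"gap at the 90th percentile: {top_gap:+.1f}")
```

In this toy data the gap is negative at the median and positive at the 90th percentile, the same sign pattern the abstract reports for short titles. A single mean or linear correlation coefficient would collapse both effects into one number.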
