Predicting Long-Term Citations from Short-Term Linguistic Influence

Sandeep Soni; David Bamman; Jacob Eisenstein

arXiv:2210.13628 · cs.CL · October 26, 2022 · 1 citation

TL;DR

This paper introduces a method to quantify linguistic influence in timestamped document collections. Influence scores estimated from the two years after a paper's publication predict its future citations better than baselines such as initial citation counts and lexical features.

Contribution

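The paper combines contextual embeddings and word frequencies, used to identify lexical and semantic changes, with a high-dimensional Hawkes process with a low-rank parameter matrix that aggregates those changes into per-document influence scores.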

Findings

01. Linguistic influence scores correlate with future citation counts.

02. The method outperforms baseline predictors including initial citations and lexical features.

03. Influence measurement is effective using only two years of post-publication data.

Abstract

A standard measure of the influence of a research paper is the number of times it is cited. However, papers may be cited for many reasons, and citation count offers limited information about the extent to which a paper affected the content of subsequent publications. We therefore propose a novel method to quantify linguistic influence in timestamped document collections. There are two main steps: first, identify lexical and semantic changes using contextual embeddings and word frequencies; second, aggregate information about these changes into per-document influence scores by estimating a high-dimensional Hawkes process with a low-rank parameter matrix. We show that this measure of linguistic influence is predictive of future citations: the estimate of linguistic influence from the two years after a paper's publication is correlated with and predictive of its citation count…
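
The second step in the abstract, a Hawkes process with a low-rank parameter matrix, can be pictured with a generic multivariate Hawkes model. The sketch below is illustrative only and is not taken from the paper or its repository: the exponential kernel, the `decay` parameter, the factor names `U` and `V`, the column-sum influence score, and all toy values are assumptions made for the example. It shows how an excitation matrix built as a product of two thin factors determines event intensities, and how a per-source score might be read off that matrix.

```python
import numpy as np

def excitation_matrix(U, V):
    # Low-rank parameterization (assumed form): A = U V^T,
    # where A[i, j] is how strongly an event from source j excites target i.
    return U @ V.T

def intensity(t, events, mu, A, decay=1.0):
    # Multivariate Hawkes intensity with an exponential kernel (an assumption):
    # lambda_i(t) = mu_i + sum over past events (s, j) of
    #               A[i, j] * decay * exp(-decay * (t - s))
    lam = np.asarray(mu, dtype=float).copy()
    for s, j in events:
        if s < t:
            lam += A[:, j] * decay * np.exp(-decay * (t - s))
    return lam

def influence_scores(A):
    # Hypothetical aggregate: total excitation each source exerts on the
    # other sources (column sum minus self-excitation).
    return A.sum(axis=0) - np.diag(A)

# Toy rank-1 example with three sources; only source 0 excites the others.
U = np.array([[0.0], [0.5], [0.2]])
V = np.array([[1.0], [0.0], [0.0]])
A = excitation_matrix(U, V)

mu = np.array([0.1, 0.1, 0.1])   # baseline rates (toy values)
events = [(0.0, 0)]              # one event at time 0 from source 0

lam = intensity(1.0, events, mu, A)
scores = influence_scores(A)
```

With a rank-`r` factorization, the number of free excitation parameters grows as `2 * n * r` instead of `n * n`, which is the usual motivation for a low-rank parameter matrix when the number of sources is large.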

Code & Models

Repositories

sandeepsoni/contextual-leadership (official)

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Complex Network Analysis Techniques