DRIFT: A Toolkit for Diachronic Analysis of Scientific Literature
Abheesht Sharma, Gunjan Chhablani, Harshit Pandey, Rajaswa Patil

TL;DR
DRIFT is an open-source toolkit designed for diachronic analysis of scientific literature, enabling researchers to track research trends and semantic changes over time using various analysis methods.
Contribution
The paper introduces DRIFT, a comprehensive, easy-to-use toolkit for diachronic research analysis, combining existing methods with new techniques for tracking scientific literature evolution.
Findings
Demonstrated utility on arXiv cs.CL corpus
Effectively tracks research trend dynamics
Identifies semantic drift and trend shifts
Abstract
In this work, we present to the NLP community, and to the wider research community as a whole, an application for the diachronic analysis of research corpora. We open source an easy-to-use tool coined: DRIFT, which allows researchers to track research trends and development over the years. The analysis methods are collated from well-cited research works, with a few of our own methods added for good measure. Succinctly put, some of the analysis methods are: keyword extraction, word clouds, predicting declining/stagnant/growing trends using Productivity, tracking bi-grams using Acceleration plots, finding the Semantic Drift of words, tracking trends using similarity, etc. To demonstrate the utility and efficacy of our tool, we perform a case study on the cs.CL corpus of the arXiv repository and draw inferences from the analysis methods. The toolkit and the associated code are available…
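As an illustration of one method named in the abstract, semantic drift is commonly measured as the cosine distance between a word's embedding in an earlier period and its embedding in a later one. The sketch below assumes the two embedding spaces have already been aligned, and it uses toy two-dimensional vectors. The function names `cosine_distance` and `rank_by_drift` are hypothetical and are not taken from the DRIFT codebase, whose actual implementation may differ.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_drift(early, late):
    """Rank words present in both periods by drift, largest first.

    `early` and `late` map words to vectors that are assumed to live in
    the same (already aligned) embedding space.
    """
    shared = early.keys() & late.keys()
    scores = {w: cosine_distance(early[w], late[w]) for w in shared}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy vectors for illustration only; these are not real embeddings.
early = {"attention": [1.0, 0.0], "parsing": [0.0, 1.0], "corpus": [1.0, 1.0]}
late = {"attention": [0.0, 1.0], "parsing": [0.0, 2.0], "transformer": [1.0, 0.0]}

ranking = rank_by_drift(early, late)
for word, score in ranking:
    print(f"{word}: {score:.2f}")
```

Words that appear in only one period ("corpus" and "transformer" here) are skipped, because drift is only defined when a word has a vector in both periods.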
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Biomedical Text Mining and Ontologies
