Assessing the impact of Open Research Information Infrastructures using NLP driven full-text Scientometrics: A case study of the LXCat open-access platform
Kalp Pandya, Khushi Shah, Nirmal Shah, Nakshi Shah, Bhaskar Chaudhury

TL;DR
This paper introduces a domain-agnostic NLP-based scientometric framework to evaluate the impact of open research information infrastructures, exemplified by the LXCat platform in low temperature plasma research, beyond traditional citation metrics.
Contribution
The study develops a scalable, full-text NLP pipeline for analyzing data usage, research themes, and workflows, providing a novel, comprehensive impact assessment method for ORI platforms.
Findings
Identified evolving data reuse patterns in LTP research
Mapped thematic shifts and research priorities over a decade
Demonstrated the framework's transferability to other domains
Abstract
Open research information (ORI) infrastructures play a central role in shaping how scientific knowledge is produced, disseminated, validated, and reused across the research lifecycle. While the visibility of such ORI infrastructures is often assessed through citation-based metrics, in this study, we present a full-text, natural language processing (NLP) driven scientometric framework to systematically quantify the impact of ORI infrastructures beyond citation counts, using the LXCat platform for low temperature plasma (LTP) research as a representative case study. The modeling of LTPs and interpretation of LTP experiments rely heavily on accurate data, much of which is hosted on LXCat, a community-driven, open-access platform central to the LTP research ecosystem. To investigate the scholarly impact of the LXCat platform over the past decade, we analyzed a curated corpus of full-text research articles…
Taxonomy
Topics: Research Data Management Practices · Biomedical Text Mining and Ontologies · Scientometrics and Bibliometrics Research
