Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles
Neslihan Suzen, Alexander Gorban, Jeremy Levesley, Evgeny Mirkes

TL;DR
This paper introduces a novel method, based on informational semantics, for representing scientific texts, and uses it to distinguish highly-cited from little-cited articles with an 80% success rate.
Contribution
It presents a new vector representation of text meaning based on information theory and demonstrates its effectiveness in predicting research impact.
Findings
Achieved an 80% success rate in classifying highly-cited vs. little-cited articles.
Proposed a novel informational semantics approach for text representation.
Showed that semantic features are significant predictors of citation impact.
Abstract
Can the analysis of the semantics of words used in the text of a scientific paper predict its future impact measured by citations? This study details examples of automated text classification that achieved an 80% success rate in distinguishing between highly-cited and little-cited articles. Automated intelligent systems allow the identification of promising works that could become influential in the scientific community. The problems of quantifying the meaning of texts and representation of human language have been clear since the inception of Natural Language Processing. This paper presents a novel method for vector representation of text meaning based on information theory and shows how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus. We describe the experimental framework used to evaluate the impact of scientific articles…
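The abstract does not state the formula behind the informational representation, so the following is only a minimal sketch of one way an information-theoretic text vector could be built. It assumes each word is scored by standard information gain about membership in each subject category, and that a text is represented by the average of its word vectors. The function names, the toy corpus, and the averaging step are all illustrative assumptions, not the authors' method.

```python
import math

def entropy(p):
    """Binary entropy in bits; taken as 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def information_gain(docs, labels, word, category):
    """Reduction in uncertainty about 'doc belongs to category'
    from observing whether `word` occurs in the doc (assumed scoring)."""
    n = len(docs)
    in_cat = [lab == category for lab in labels]
    has_word = [word in doc for doc in docs]
    h_prior = entropy(sum(in_cat) / n)
    h_cond = 0.0
    for present in (True, False):
        idx = [i for i in range(n) if has_word[i] == present]
        if idx:
            p_cat = sum(in_cat[i] for i in idx) / len(idx)
            h_cond += len(idx) / n * entropy(p_cat)
    return h_prior - h_cond

def word_vector(docs, labels, word, categories):
    """One coordinate per category: the word's information gain for it."""
    return [information_gain(docs, labels, word, c) for c in categories]

def text_vector(doc, docs, labels, categories):
    """Hypothetical text representation: mean of the word vectors."""
    vecs = [word_vector(docs, labels, w, categories) for w in sorted(doc)]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# Toy corpus (invented for illustration): each document is a set of words.
docs = [{"gene", "protein"}, {"gene", "cell"},
        {"galaxy", "star"}, {"star", "orbit"}]
labels = ["bio", "bio", "astro", "astro"]
categories = ["bio", "astro"]

vec = text_vector(docs[0], docs, labels, categories)
```

In this toy corpus, "gene" occurs in exactly the "bio" documents, so it carries the full one bit of information about that category, while a word absent from the corpus carries none. A vector such as `vec` could then be passed to any standard classifier to separate highly-cited from little-cited texts.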
Taxonomy
Topics: Semantic Web and Ontologies · Data Quality and Management · Big Data and Business Intelligence
