An Ensemble Approach for Research Article Identification: a Case Study in Artificial Intelligence

Lie Tang; Xianke Zhou; Min Lu

arXiv:2304.09487·cs.DL·November 20, 2023·1 cites

Open Access

TL;DR

This paper introduces an ensemble method combining decision trees, sciBERT, regex, and SVM to accurately identify AI research articles, outperforming existing search techniques and revealing insights into research trends.

Contribution

The study presents a novel ensemble approach for research article identification that improves accuracy and provides research trend insights in AI.

Findings

01

Captured around 97% of AI-related articles in the Web of Science (WoS) corpus at a precision of 0.92

02

Increased F1 score by 0.15 over the existing search-term-based approach

03

Identified more articles in subfields like feature extraction
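The first two findings can be related through the standard F1 formula, the harmonic mean of precision and recall. The sketch below assumes that "captured around 97%" denotes recall; the listing does not state the resulting F1 value, so the number computed here is only what that reading implies.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision = 0.92  # reported precision
recall = 0.97     # assumption: "captured around 97%" is read as recall

f1 = f1_score(precision, recall)
print(round(f1, 3))  # prints 0.944
```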

Abstract

This study presents an ensemble approach that addresses the challenges of identifying and analyzing research articles in rapidly evolving fields, using Artificial Intelligence (AI) as a case study. Our approach applies a decision tree, sciBERT, and regular expression matching to different fields of each article, and uses an SVM to merge the results of these models. We evaluated the method on a manually labeled dataset and found that the combined approach captured around 97% of AI-related articles in the Web of Science (WoS) corpus with a precision of 0.92. This represents a 0.15 increase in F1 score over the existing search-term-based approach. We then analyzed publication volume trends and common research themes. We found that compared with existing methods, our ensemble approach revealed an increased degree of…
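The merging step described in the abstract resembles stacking: each base model scores an article, and an SVM is trained on those scores. The following is a minimal sketch using scikit-learn, not the authors' implementation. The keyword pattern, the toy records, the metadata features, and the assignment of models to article fields are all hypothetical, and the sciBERT classifier is replaced by a made-up placeholder score to keep the example self-contained.

```python
import re

import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical keyword pattern; the paper's actual expressions are not given here.
AI_PATTERN = re.compile(
    r"\b(neural network|machine learning|artificial intelligence)\b",
    re.IGNORECASE,
)


def regex_flag(text: str) -> int:
    """Return 1 if the text matches the illustrative AI keyword pattern."""
    return int(AI_PATTERN.search(text) is not None)


# Made-up records: (title, metadata features, placeholder text score, label).
# The text score stands in for the probability a sciBERT classifier might emit.
RECORDS = [
    ("Machine learning for protein folding", [1, 0], 0.91, 1),
    ("A neural network approach to speech", [1, 1], 0.88, 1),
    ("Artificial intelligence in radiology", [0, 1], 0.79, 1),
    ("Soil erosion in river basins", [0, 0], 0.07, 0),
    ("Medieval trade routes revisited", [0, 0], 0.03, 0),
    ("Polymer synthesis at low temperature", [0, 0], 0.12, 0),
]

titles = [record[0] for record in RECORDS]
metadata = np.array([record[1] for record in RECORDS])
text_scores = np.array([record[2] for record in RECORDS])
labels = np.array([record[3] for record in RECORDS])

# Base model 1: decision tree on the (hypothetical) metadata features.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(metadata, labels)
tree_scores = tree.predict_proba(metadata)[:, 1]

# Base model 2: regular expression matching on titles.
regex_flags = np.array([regex_flag(title) for title in titles])

# Stack the three base-model outputs into one feature row per article.
meta_features = np.column_stack([tree_scores, text_scores, regex_flags])

# Merging model: an SVM trained on the base-model outputs.
svm = SVC(kernel="linear").fit(meta_features, labels)
predictions = svm.predict(meta_features)
print(predictions)
```

For brevity the sketch fits and predicts on the same toy rows. In practice the SVM would normally be trained on out-of-fold base-model predictions and evaluated on held-out labeled articles, so that the merged score does not simply memorize the base models' training fit.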

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods