An N-gram based approach to auto-extracting topics from research articles
Linkai Zhu, Maoyi Huang, Maomao Chen, Wennan Wang

TL;DR
This paper presents an N-gram based automated method for extracting research article topics, focusing on efficiency and relevance filtering, tested within the autonomous vehicle domain to reduce manual effort.
Contribution
The study introduces a novel N-gram analysis approach combined with custom filtering standards for automatic topic extraction from research articles.
Findings
Effective automatic topic extraction demonstrated in autonomous vehicle research articles
Comparison shows high similarity between automated and manual topic identification
Method improves efficiency in processing large volumes of research papers
Abstract
Identifying the topic of an article takes a lot of manual work, and with a large volume of articles the process can be exhausting. Our approach addresses this issue by automatically extracting topics from the text of large numbers of articles, with attention to the efficiency of the process. Building on existing N-gram analysis, our research examines how often certain words appear in documents in order to support automatic topic extraction. To improve efficiency, we apply custom filtering standards and delete as many noncritical or irrelevant phrases as possible. In this way, we can ensure we are selecting unique keyphrases for each article, which capture its core idea. We chose to center our research on the autonomous vehicle domain, since the research is relevant to our daily lives. We have to convert the PDF versions…
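The frequency-and-filter idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the stopword list, the `min_count` threshold, and the function names (`tokenize`, `ngrams`, `extract_keyphrases`) are hypothetical stand-ins for the paper's "custom filtering standards", which are not specified here.

```python
import re
from collections import Counter

# Hypothetical stopword list; the paper's actual filtering standards are not given.
STOPWORDS = {
    "the", "a", "an", "of", "in", "and", "to", "for", "is", "are",
    "on", "with", "we", "our", "this", "that",
}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous n-token window as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_keyphrases(text, n=2, top_k=5, min_count=2):
    """Count n-grams, drop rare or stopword-bounded ones, return the top_k.

    The filter here (assumed, not from the paper) removes n-grams that
    occur fewer than min_count times or that start or end with a stopword.
    """
    counts = Counter(ngrams(tokenize(text), n))
    kept = {
        gram: count
        for gram, count in counts.items()
        if count >= min_count
        and gram[0] not in STOPWORDS
        and gram[-1] not in STOPWORDS
    }
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(kept.items(), key=lambda item: (-item[1], item[0]))
    return [(" ".join(gram), count) for gram, count in ranked[:top_k]]


sample = (
    "Autonomous vehicle perception is hard. "
    "Autonomous vehicle planning depends on perception. "
    "The autonomous vehicle must plan."
)
keyphrases = extract_keyphrases(sample)
```

On the toy `sample` text, the only bigram that survives both filters is "autonomous vehicle", which appears three times. A real pipeline would first need text extracted from the PDF articles, a step this sketch leaves out.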
Taxonomy
Topics: Advanced Text Analysis Techniques · Natural Language Processing Techniques · Topic Modeling
