Paperswithtopic: Topic Identification from Paper Title Only
Daehyun Cho, Christian Wallraven

TL;DR
This paper introduces a dataset and evaluates various machine learning models, including transformers, for classifying AI paper titles into sub-fields, with an emphasis on interpretability through attention visualizations.
Contribution
It provides a new annotated dataset of AI paper titles with sub-field labels and compares multiple classification algorithms, including transformer models, for short-text classification.
Findings
Transformer models outperform traditional algorithms in accuracy.
Attention visualizations help interpret model decisions.
The dataset enables future research in automated paper classification.
Abstract
The deep learning field is growing rapidly as witnessed by the exponential growth of papers submitted to journals, conferences, and pre-print servers. To cope with the sheer number of papers, several text mining tools from natural language processing (NLP) have been proposed that enable researchers to keep track of recent findings. In this context, our paper makes two main contributions: first, we collected and annotated a dataset of papers paired by title and sub-field from the field of artificial intelligence (AI), and, second, we present results on how to predict a paper's AI sub-field from a given paper title only. Importantly, for the latter, short-text classification task we compare several algorithms from conventional machine learning all the way up to recent, larger transformer architectures. Finally, for the transformer models, we also present gradient-based, attention…
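To make the title-only classification task concrete, here is a minimal sketch of one conventional baseline: a bag-of-words multinomial Naive Bayes classifier with Laplace smoothing. This is an illustration only, not the authors' code. The summary above does not name the algorithms the authors compared, and the toy titles, the sub-field labels, and the names `tokenize`, `NaiveBayesTitleClassifier`, `TOY_TITLES`, and `TOY_LABELS` are all invented for this example rather than taken from the dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(title):
    """Lowercase a title and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", title.lower())


class NaiveBayesTitleClassifier:
    """Multinomial Naive Bayes over title words (hypothetical baseline)."""

    def fit(self, titles, labels):
        self.class_counts = Counter(labels)      # titles per sub-field
        self.word_counts = defaultdict(Counter)  # word counts per sub-field
        self.vocab = set()
        for title, label in zip(titles, labels):
            tokens = tokenize(title)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total = len(labels)
        return self

    def predict(self, title):
        # Words never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(title) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus Laplace-smoothed log likelihood of each word.
            score = math.log(n / self.total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data; NOT drawn from the paper's dataset or label set.
TOY_TITLES = [
    "Deep residual networks for image recognition",
    "Object detection in images with convolutional networks",
    "Neural machine translation with attention",
    "Pretraining language models for text classification",
    "Policy gradient methods for reinforcement learning",
    "Exploration strategies for reward maximization in games",
]
TOY_LABELS = [
    "computer vision",
    "computer vision",
    "natural language processing",
    "natural language processing",
    "reinforcement learning",
    "reinforcement learning",
]

clf = NaiveBayesTitleClassifier().fit(TOY_TITLES, TOY_LABELS)
print(clf.predict("Image segmentation with convolutional networks"))
print(clf.predict("Language models for machine translation"))
```

A word-count model like this treats a title as an unordered set of tokens, which is one reason short titles are a hard input: a handful of words must carry the whole signal. The transformer models mentioned in the abstract are not sketched here.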
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Biomedical Text Mining and Ontologies
