TL;DR
This paper explores how BERT's self-attention mechanism highlights domain-specific words in scientific articles and compares it to traditional feature selection methods, revealing that attention focuses on relevant terms but classification relies on contextualized outputs.
Contribution
The study characterizes self-attention as a feature selection method in scientific text classification and compares its effectiveness with conventional feature selection techniques.
Findings
Self-attention focuses on domain-related words in scientific articles.
Conventional feature selection methods remain the better option for learning classifiers from scratch.
With ConceptNet as ground truth, the most attended words are more related to the articles' research fields than the words chosen by conventional feature selection.
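To make the "most attended words" idea concrete, here is a minimal sketch of how attention could be aggregated per vocabulary word and how one might find the small subset of words that receives most of it. The function names, the 0.5 and 0.8 coverage shares, and the toy weights are all hypothetical; how the weights are read out of BERT (which layer, head, or query position) is not specified in this summary and is left open.

```python
from collections import defaultdict


def aggregate_attention(docs):
    """Sum the attention each vocabulary word receives across documents.

    `docs` is a list of documents; each document is a list of
    (token, attention_weight) pairs. Extracting those weights from the
    model is outside this sketch (assumption).
    """
    totals = defaultdict(float)
    for doc in docs:
        for token, weight in doc:
            totals[token.lower()] += weight
    return dict(totals)


def smallest_covering_subset(totals, share=0.8):
    """Return the fewest words whose summed attention reaches `share` of the total."""
    threshold = share * sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for word, mass in ranked:
        if running >= threshold:
            break
        selected.append(word)
        running += mass
    return selected


# Toy, made-up weights purely for illustration (not results from the paper).
toy_docs = [
    [("the", 0.05), ("protein", 0.60), ("binds", 0.25), ("to", 0.10)],
    [("a", 0.05), ("protein", 0.55), ("folding", 0.30), ("study", 0.10)],
]
totals = aggregate_attention(toy_docs)
half_mass_words = smallest_covering_subset(totals, share=0.5)
most_mass_words = smallest_covering_subset(totals, share=0.8)
```

In this toy input a single word already covers half of the total attention, which mirrors the kind of concentration the findings describe, although the numbers themselves are invented.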
Abstract
We investigate the self-attention mechanism of BERT in a fine-tuning scenario for the classification of scientific articles over a taxonomy of research disciplines. We observe how self-attention focuses on words that are highly related to the domain of the article. Particularly, a small subset of vocabulary words tends to receive most of the attention. We compare and evaluate the subset of the most attended words with feature selection methods normally used for text classification in order to characterize self-attention as a possible feature selection approach. Using ConceptNet as ground truth, we also find that attended words are more related to the research fields of the articles. However, conventional feature selection methods are still a better option to learn classifiers from scratch. This result suggests that, while self-attention identifies domain-relevant terms, the…
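The comparison between attended words and conventional feature selection can be sketched as follows. The abstract does not name the feature selection methods used, so the chi-square statistic here is an assumption chosen as a common text-classification baseline, and Jaccard overlap is just one hypothetical way to compare the two word sets. All data below is invented for illustration.

```python
def chi_square(term, docs, labels, target):
    """Chi-square statistic for the 2x2 table of term presence vs. class membership."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        present = term in tokens
        in_class = label == target
        if present and in_class:
            a += 1
        elif present:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # A term that occurs in every document (or none) carries no signal.
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def top_k_chi_square(docs, labels, target, k):
    """Rank vocabulary terms by chi-square score; ties are broken alphabetically."""
    vocab = sorted({t for tokens in docs for t in tokens})
    scored = sorted(vocab, key=lambda t: (-chi_square(t, docs, labels, target), t))
    return scored[:k]


def jaccard(words_a, words_b):
    """Overlap between two word sets, from 0 (disjoint) to 1 (identical)."""
    set_a, set_b = set(words_a), set(words_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy corpus: each document is a set of tokens with a discipline label.
toy_docs = [
    {"protein", "binds", "the"},
    {"protein", "folding", "the"},
    {"market", "price", "the"},
    {"market", "trade", "the"},
]
toy_labels = ["biology", "biology", "economics", "economics"]

chi_words = top_k_chi_square(toy_docs, toy_labels, target="biology", k=2)
attended_words = ["protein", "folding"]  # hypothetical most attended words
overlap = jaccard(attended_words, chi_words)
```

Note that chi-square scores association in either direction, so a term that reliably signals a different discipline ("market" here) ranks as high for the target class as a term that signals the class itself. That is one way a statistically selected feature set can differ from a set of attended, domain-related words.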
Taxonomy
Methods: Linear Layer · Feature Selection · Dense Connections · Residual Connection · Adam · Linear Warmup With Linear Decay · Dropout · Softmax · Multi-Head Attention
