Does BERT look at sentiment lexicon?
Elena Razova, Sergey Vychegzhanin, Evgeny Kotelnikov

TL;DR
This study investigates whether BERT models, specifically RuBERT, attend more to sentiment lexicon than to neutral words. It finds that a majority of attention heads place more weight on sentiment-bearing tokens, a result that helps interpret the model.
Contribution
The paper analyzes attention weights in RuBERT and shows that most heads weight sentiment lexicon more heavily than neutral words, which contributes to the interpretability of neural network models for sentiment analysis.
Findings
On average, about 3/4 (75%) of attention heads attend more to sentiment lexicon than to neutral words.
In RuBERT fine-tuned on sentiment corpora, the distributions of attention weights differ between sentiment and neutral words.
Most attention heads statistically prioritize sentiment lexicons.
Abstract
The main approaches to sentiment analysis are rule-based methods and machine learning, in particular, deep neural network models with the Transformer architecture, including BERT. The performance of neural network models in the tasks of sentiment analysis is superior to the performance of rule-based methods. The reasons for this situation remain unclear due to the poor interpretability of deep neural network models. One of the main keys to understanding the fundamental differences between the two approaches is the analysis of how sentiment lexicon is taken into account in neural network models. To this end, we study the attention weights matrices of the Russian-language RuBERT model. We fine-tune RuBERT on sentiment text corpora and compare the distributions of attention weights for sentiment and neutral lexicons. It turns out that, on average, 3/4 of the heads of various model…
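The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `mean_attention_received` and `fraction_of_sentiment_heads` are hypothetical, the aggregation (average attention a token receives over all query positions) is an assumption, and the attention tensor is hand-built toy data rather than RuBERT output. With a real model, a tensor of shape (heads, seq, seq) could be obtained, for example, by requesting attentions from a Hugging Face model with `output_attentions=True`.

```python
import numpy as np

def mean_attention_received(attn, mask):
    """Per-head mean attention weight received by the tokens selected by mask.

    attn: array of shape (heads, seq, seq); attn[h, i, j] is the weight
          that query token i puts on key token j in head h.
    mask: boolean array of shape (seq,) selecting the tokens of interest.
    """
    received = attn.mean(axis=1)           # (heads, seq): average over queries
    return received[:, mask].mean(axis=1)  # (heads,): average over selected tokens

def fraction_of_sentiment_heads(attn, sentiment_mask):
    """Share of heads whose mean attention to sentiment tokens exceeds
    their mean attention to the remaining (neutral) tokens."""
    sentiment_mask = np.asarray(sentiment_mask, dtype=bool)
    sentiment = mean_attention_received(attn, sentiment_mask)
    neutral = mean_attention_received(attn, ~sentiment_mask)
    return float(np.mean(sentiment > neutral))

# Hand-built toy data (NOT model output): 4 heads, 3 tokens, token 1 is "sentiment".
pro = np.tile([0.2, 0.6, 0.2], (3, 1))   # every query puts 0.6 on token 1
anti = np.tile([0.4, 0.2, 0.4], (3, 1))  # every query puts 0.2 on token 1
toy_attn = np.stack([pro, pro, anti, anti])
toy_mask = [False, True, False]

print(fraction_of_sentiment_heads(toy_attn, toy_mask))  # 0.5
```

In the toy tensor two of the four heads favor the sentiment token, so the function returns 0.5. A per-head comparison of means like this ignores variance across sentences; a statistical comparison of the two weight distributions would need a significance test on top of it.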
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Weight Decay · Adam · Linear Warmup With Linear Decay · Residual Connection · WordPiece
