Distributionally Robust Classifiers in Sentiment Analysis
Shilun Li, Renee Li, Carina Zhang

TL;DR
This paper introduces a distributionally robust sentiment classifier based on BERT with additional layers, designed to maintain performance under dataset shifts, and demonstrates its effectiveness on IMDb and Rotten Tomatoes datasets.
Contribution
The paper presents a novel BERT-based sentiment classifier with integrated DRO and extra layers to enhance robustness against distributional shifts.
Findings
The DRO model improves performance under dataset shift
The additional layers contribute to robustness
Effective on the IMDb-to-Rotten Tomatoes shift
Abstract
In this paper, we propose sentiment classification models based on BERT integrated with DRO (Distributionally Robust Classifiers) to improve model performance on datasets with distributional shifts. We added a 2-layer Bi-LSTM, a projection layer (onto the simplex or an Lp ball), and a linear layer on top of BERT to achieve distributional robustness. We considered one form of distributional shift (from the IMDb dataset to the Rotten Tomatoes dataset). Our experiments confirm that the DRO model improves performance on a test set whose distribution is shifted from that of the training set.
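The projection layer mentioned in the abstract maps a vector onto the simplex or an Lp ball. The abstract does not say how the layer is implemented or where exactly it sits in the forward pass, so the following is only a minimal sketch of the two geometric operations themselves, in plain Python. The function names `project_simplex` and `project_l2_ball` are hypothetical, the simplex projection uses the standard sort-based method, and p = 2 is assumed for the ball.

```python
import math


def project_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {w : w_i >= 0, sum(w) = 1}, using the sort-based method.
    Illustrative only; not taken from the paper's code."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        # Keep the threshold from the last index where the entry stays positive.
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def project_l2_ball(v, radius=1.0):
    """Euclidean projection of v onto the L2 ball of the given radius.
    p = 2 is an assumption here; the abstract only says 'Lp ball'."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= radius:
        return list(v)
    scale = radius / norm
    return [x * scale for x in v]


if __name__ == "__main__":
    print(project_simplex([2.0, 0.0]))
    print(project_l2_ball([3.0, 4.0], radius=1.0))
```

In a model like the one described, an operation of this kind would presumably be applied to a batch of hidden vectors between the Bi-LSTM and the final linear layer, but that placement is an assumption based only on the order in which the abstract lists the layers.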
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Text and Document Classification Technologies
Methods: Multi-Head Attention · Attention Is All You Need · Test · Layer Normalization · Softmax · Weight Decay · Residual Connection · Linear Warmup With Linear Decay · WordPiece · Attention Dropout
