TL;DR
BERTweet is a large-scale pre-trained language model specifically designed for English Tweets, outperforming existing models on key Tweet NLP tasks and facilitating future research in social media text analysis.
Contribution
The paper introduces BERTweet, the first public large-scale pre-trained language model for English Tweets. It is trained with the RoBERTa pre-training procedure and achieves state-of-the-art results on three Tweet NLP tasks.
Findings
BERTweet outperforms RoBERTa-base and XLM-R-base on Tweet NLP tasks.
It improves on the previous state-of-the-art models for POS tagging, NER, and text classification.
It is released under the MIT License for research and application development.
Abstract
We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet, having the same architecture as BERT-base (Devlin et al., 2019), is trained using the RoBERTa pre-training procedure (Liu et al., 2019). Experiments show that BERTweet outperforms strong baselines RoBERTa-base and XLM-R-base (Conneau et al., 2020), producing better performance results than the previous state-of-the-art models on three Tweet NLP tasks: Part-of-speech tagging, Named-entity recognition and text classification. We release BERTweet under the MIT License to facilitate future research and applications on Tweet data. Our BERTweet is available at https://github.com/VinAIResearch/BERTweet
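The abstract says only that BERTweet is trained "using the RoBERTa pre-training procedure". As a rough illustration of what that usually involves, the sketch below shows masked-language-model corruption with dynamic masking, where a fresh mask is sampled each time a sequence is seen. It is not taken from the BERTweet code. The function name `dynamic_mask`, the toy vocabulary, the example sentence, and the `<mask>` symbol are all hypothetical choices for this example. The 15% selection rate and the 80/10/10 replacement split are the conventional BERT/RoBERTa defaults, assumed here rather than confirmed by this page.

```python
import random

MASK_TOKEN = "<mask>"  # assumed mask symbol for this sketch


def dynamic_mask(tokens, vocab, rng, mask_prob=0.15):
    """Sample a fresh masked-LM corruption of `tokens`.

    Returns (masked_tokens, labels). labels[i] holds the original token
    at positions the model must predict, and None everywhere else.
    """
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # prediction target is the original token
            roll = rng.random()
            if roll < 0.8:
                masked.append(MASK_TOKEN)          # 80%: replace with the mask symbol
            elif roll < 0.9:
                masked.append(rng.choice(vocab))   # 10%: replace with a random token
            else:
                masked.append(tok)                 # 10%: keep the token unchanged
        else:
            labels.append(None)  # position not selected, so no loss is computed
            masked.append(tok)
    return masked, labels


# Hypothetical toy input; a real pipeline would operate on subword ids.
tokens = "just landed in hanoi and the weather is great".split()
vocab = sorted(set(tokens))
rng = random.Random(0)

# Calling the function once per epoch yields a different mask each time,
# which is the "dynamic" part, as opposed to masking once during preprocessing.
for epoch in range(3):
    masked, labels = dynamic_mask(tokens, vocab, rng)
    print(epoch, masked)
```

Passing the random generator in explicitly keeps the sketch reproducible: the same seed gives the same mask, while a generator shared across epochs gives a new mask on each pass.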
Taxonomy
Methods: Linear Layer · Weight Decay · Softmax · Adam · Multi-Head Attention · Dropout · Attention Dropout · Linear Warmup With Linear Decay · Dense Connections
