SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research
Dimosthenis Antypas, Asahi Ushio, Francesco Barbieri, Leonardo Neves, Kiamehr Rezaee, Luis Espinosa-Anke, Jiaxin Pei, Jose Camacho-Collados

TL;DR
SuperTweetEval is a comprehensive benchmark designed to unify and evaluate NLP models across diverse social media tasks, highlighting ongoing challenges despite recent advances in language modeling.
Contribution
The paper introduces SuperTweetEval, a novel unified benchmark for social media NLP, combining multiple tasks and datasets to facilitate comprehensive model evaluation.
Findings
Social media NLP remains challenging despite recent language model improvements.
Benchmarking shows varied model performance across different social media tasks.
SuperTweetEval enables fair comparison of models on heterogeneous social media datasets.
Abstract
Despite its relevance, the maturity of NLP for social media pales in comparison with general-purpose models, metrics and benchmarks. This fragmented landscape makes it hard for the community to know, for instance, given a task, which is the best performing model and how it compares with others. To alleviate this issue, we introduce a unified benchmark for NLP evaluation in social media, SuperTweetEval, which includes a heterogeneous set of tasks and datasets combined, adapted and constructed from scratch. We benchmarked the performance of a wide range of models on SuperTweetEval and our results suggest that, despite the recent advances in language modelling, social media remains challenging.
Code & Models
- 🤗 cardiffnlp/twitter-roberta-base-tempo-wic-latest · model · 8 dl
- 🤗 cardiffnlp/twitter-roberta-base-emotion-latest · model · 396 dl · ♡ 5
- 🤗 cardiffnlp/twitter-roberta-base-emoji-latest · model · 17 dl · ♡ 4
- 🤗 cardiffnlp/twitter-roberta-base-hate-latest-st · model · 24 dl
- 🤗 cardiffnlp/twitter-roberta-base-topic-sentiment-latest · model · 91 dl · ♡ 5
- 🤗 cardiffnlp/twitter-roberta-base-intimacy-latest · model · 16 dl
- 🤗 cardiffnlp/twitter-roberta-base-ner7-latest · model · 114 dl
- 🤗 cardiffnlp/twitter-roberta-base-similarity-latest · model · 8 dl
- 🤗 cardiffnlp/twitter-roberta-base-topic-latest · model · 12 dl
- 🤗 cardiffnlp/twitter-roberta-base-nerd-latest · model · 6 dl
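The checkpoints listed above can probably be loaded with the Hugging Face `transformers` library. The following is a minimal sketch, not the authors' evaluation code. It assumes that the trailing "model" on each entry is a page label rather than part of the repository ID, and that the chosen checkpoint exposes a sequence-classification head. The dictionary keys, `SUPERTWEETEVAL_MODELS`, and `load_classifier` are hypothetical names introduced here for illustration.

```python
# Repository IDs read from the "Code & Models" list.
# Assumption: the "model" suffix shown on the page is a label, not part of the ID.
SUPERTWEETEVAL_MODELS = {
    "tempo-wic": "cardiffnlp/twitter-roberta-base-tempo-wic-latest",
    "emotion": "cardiffnlp/twitter-roberta-base-emotion-latest",
    "emoji": "cardiffnlp/twitter-roberta-base-emoji-latest",
    "hate": "cardiffnlp/twitter-roberta-base-hate-latest-st",
    "topic-sentiment": "cardiffnlp/twitter-roberta-base-topic-sentiment-latest",
    "intimacy": "cardiffnlp/twitter-roberta-base-intimacy-latest",
    "ner7": "cardiffnlp/twitter-roberta-base-ner7-latest",
    "similarity": "cardiffnlp/twitter-roberta-base-similarity-latest",
    "topic": "cardiffnlp/twitter-roberta-base-topic-latest",
    "nerd": "cardiffnlp/twitter-roberta-base-nerd-latest",
}


def load_classifier(task: str):
    """Build a text-classification pipeline for one listed checkpoint.

    Hypothetical helper: it downloads weights on first use and assumes the
    checkpoint has a sequence-classification head, which may not hold for
    every task (for example, ner7 is a token-level task).
    """
    # Imported lazily so the ID table is usable without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=SUPERTWEETEVAL_MODELS[task])


if __name__ == "__main__":
    # Requires `pip install transformers torch` and network access.
    classifier = load_classifier("emotion")
    print(classifier("So excited for the weekend!"))
```

Keeping the IDs in one table makes it easy to loop over tasks, though each task likely needs its own input format and metric, which this sketch does not cover.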
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Wikis in Education and Collaboration
Methods: Sparse Evolutionary Training
