SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research

Dimosthenis Antypas; Asahi Ushio; Francesco Barbieri; Leonardo Neves; Kiamehr Rezaee; Luis Espinosa-Anke; Jiaxin Pei; Jose Camacho-Collados

arXiv:2310.14757·cs.CL·October 24, 2023·1 cites

Open Access · 10 models · 1 dataset

TL;DR

SuperTweetEval is a comprehensive benchmark designed to unify and evaluate NLP models across diverse social media tasks, highlighting ongoing challenges despite recent advances in language modeling.

Contribution

The paper introduces SuperTweetEval, a novel unified benchmark for social media NLP, combining multiple tasks and datasets to facilitate comprehensive model evaluation.

Findings

01. Social media NLP remains challenging despite recent language model improvements.

02. Benchmarking shows varied model performance across different social media tasks.

03. SuperTweetEval enables fair comparison of models on heterogeneous social media datasets.

Abstract

Despite its relevance, the maturity of NLP for social media pales in comparison with general-purpose models, metrics and benchmarks. This fragmented landscape makes it hard for the community to know, for instance, given a task, which is the best performing model and how it compares with others. To alleviate this issue, we introduce a unified benchmark for NLP evaluation in social media, SuperTweetEval, which includes a heterogeneous set of tasks and datasets combined, adapted and constructed from scratch. We benchmarked the performance of a wide range of models on SuperTweetEval and our results suggest that, despite the recent advances in language modelling, social media remains challenging.

Code & Models

Datasets

cardiffnlp/super_tweeteval (dataset, 1.2k downloads)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Wikis in Education and Collaboration

Methods: Sparse Evolutionary Training