A Benchmark Study of Contrastive Learning for Arabic Social Meaning

Md Tawkat Islam Khondaker; El Moatez Billah Nagoudi; AbdelRahim Elmadany; Muhammad Abdul-Mageed; Laks V.S. Lakshmanan

arXiv:2210.12314·cs.CL·October 25, 2022·1 cites

TL;DR

This paper conducts a comprehensive benchmark study of contrastive learning methods applied to Arabic social meaning NLP tasks, demonstrating their effectiveness and data efficiency, especially in low-resource settings.

Contribution

It is the first extensive evaluation of contrastive learning for Arabic social meaning tasks, showing its advantages over traditional finetuning methods.

Findings

01 CL outperforms vanilla finetuning on most tasks

02 CL is data efficient, especially in low-resource scenarios

03 The study provides empirical evidence of CL's promise for Arabic NLP

Abstract

Contrastive learning (CL) has brought significant progress to various NLP tasks. Despite this progress, CL has not been applied to Arabic NLP to date, nor is it clear how much benefit it could bring to particular classes of tasks such as those involved in Arabic social meaning (e.g., sentiment analysis, dialect identification, hate speech detection). In this work, we present a comprehensive benchmark study of state-of-the-art supervised CL methods on a wide array of Arabic social meaning tasks. Through extensive empirical analyses, we show that CL methods outperform vanilla finetuning on most of the tasks we consider. We also show that CL can be data efficient, and we quantify this efficiency. Overall, our work demonstrates the promise of CL methods, including in low-resource settings.
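To make "supervised CL" concrete, here is a minimal sketch of one widely used supervised contrastive objective, the SupCon-style loss, in plain Python. It is an illustration only: the abstract does not say which supervised CL methods the paper benchmarks, so this should not be read as the authors' implementation. The function names `normalize` and `supcon_loss`, the temperature of 0.1, and the toy 2-D embeddings are all assumptions made for the example.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """SupCon-style loss over a batch (illustrative sketch, not the paper's code).

    For each anchor, every other example with the same label is a positive;
    all other examples in the batch form the softmax denominator.
    Anchors without any positive are skipped.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of the anchor to every other example.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability over this anchor's positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch (assumed data): two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(toy_embeddings, [0, 0, 1, 1])     # labels match clusters
misaligned = supcon_loss(toy_embeddings, [0, 1, 0, 1])  # labels cut across clusters
print(aligned < misaligned)
```

The loss is small when examples sharing a label sit close together in embedding space and large when they do not, which is the property a supervised CL objective adds on top of, or in place of, plain cross-entropy finetuning.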

Code & Models

Repositories

Tawkat/Arabic-CL-Benchmark (PyTorch, official)

Taxonomy

Topics: Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining · Hate Speech and Cyberbullying Detection