BERT-based Multi-Task Model for Country and Province Level Modern Standard Arabic and Dialectal Arabic Identification

Abdellah El Mekki; Abdelkader El Mahdaouy; Kabil Essefar; Nabil El Mamoun; Ismail Berrada; Ahmed Khoumsi

arXiv:2106.12495·cs.CL·June 24, 2021

TL;DR

This paper introduces a BERT-based multi-task learning model that identifies Modern Standard Arabic and Dialectal Arabic at both the country and the province level, and that outperforms single-task models on most subtasks.

Contribution

The paper presents an end-to-end deep multi-task learning model that combines a shared BERT encoder, two task-specific attention layers, and two classifiers to identify Modern Standard Arabic and Dialectal Arabic at the country and province levels.

Findings

01. MTL model outperforms single-task models on most subtasks

02. Shared BERT encoder effectively captures inter-task features

03. Model achieves higher accuracy in country and province level identification

Abstract

Dialect and standard language identification are crucial tasks for many Arabic natural language processing applications. In this paper, we present our deep learning-based system, submitted to the second NADI shared task for country-level and province-level identification of Modern Standard Arabic (MSA) and Dialectal Arabic (DA). The system is based on an end-to-end deep Multi-Task Learning (MTL) model that tackles both country-level and province-level MSA/DA identification. The MTL model consists of a shared Bidirectional Encoder Representations from Transformers (BERT) encoder, two task-specific attention layers, and two classifiers. Our key idea is to leverage both the task-discriminative and the inter-task shared features for country and province MSA/DA identification. The obtained results show that our MTL model outperforms single-task models on most subtasks.
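The data flow described in the abstract (one shared encoder output, two task-specific attention layers, two classifiers) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. Several things here are assumptions made for illustration only: the shared BERT encoder is replaced by random token vectors, the attention layer is a simple dot-product pooling against a per-task query vector, the two task losses are summed without weighting, and all names (`attention_pool`, `mtl_forward`, `params`) and sizes are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(token_vecs, query):
    """Task-specific attention (assumed form): score each token against a
    per-task query vector, then return the attention-weighted sum."""
    weights = softmax([dot(h, query) for h in token_vecs])
    dim = len(token_vecs[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, token_vecs))
              for i in range(dim)]
    return pooled, weights


def classify(pooled, class_weights):
    """Linear classifier followed by softmax; one weight row per class."""
    return softmax([dot(pooled, row) for row in class_weights])


def cross_entropy(probs, gold):
    return -math.log(probs[gold])


def mtl_forward(token_vecs, params, gold_country, gold_province):
    """Both heads read the SAME encoder output; only the attention query
    and the classifier weights are task-specific."""
    country_vec, _ = attention_pool(token_vecs, params["country_query"])
    province_vec, _ = attention_pool(token_vecs, params["province_query"])
    p_country = classify(country_vec, params["country_clf"])
    p_province = classify(province_vec, params["province_clf"])
    # Assumption: unweighted sum of the two task losses.
    loss = (cross_entropy(p_country, gold_country)
            + cross_entropy(p_province, gold_province))
    return p_country, p_province, loss


random.seed(0)
# Toy sizes chosen for illustration, not the shared task's label counts.
HIDDEN, N_COUNTRY, N_PROVINCE, SEQ_LEN = 8, 3, 5, 4


def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


params = {
    "country_query": rand_vec(HIDDEN),
    "province_query": rand_vec(HIDDEN),
    "country_clf": [rand_vec(HIDDEN) for _ in range(N_COUNTRY)],
    "province_clf": [rand_vec(HIDDEN) for _ in range(N_PROVINCE)],
}

# Stand-in for the shared encoder's per-token output for one sentence.
token_vecs = [rand_vec(HIDDEN) for _ in range(SEQ_LEN)]

p_country, p_province, loss = mtl_forward(
    token_vecs, params, gold_country=1, gold_province=2
)
print("country distribution:", p_country)
print("province distribution:", p_province)
print("joint loss:", loss)
```

In a trained system, gradients of the joint loss would flow back through both heads into the shared encoder, which is how inter-task features could be learned while each attention layer stays task-discriminative.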

Taxonomy

Topics: Natural Language Processing Techniques · Authorship Attribution and Profiling · Topic Modeling