IITD at the WANLP 2022 Shared Task: Multilingual Multi-Granularity Network for Propaganda Detection

Shubham Mittal; Preslav Nakov

arXiv:2210.17190·cs.CL·November 1, 2022

TL;DR

This paper describes a multilingual, transformer-based system for detecting propaganda techniques in Arabic tweets. The system ranked second in both subtasks of the WANLP 2022 shared task, and the authors analyze whether additional English training data helps.

Contribution

The paper uses XLM-R for multi-label technique classification (Subtask 1) and a multi-granularity network with an mBERT encoder for span identification (Subtask 2) in Arabic tweets. It reports competitive performance and examines the effect of adding a larger English propaganda corpus.

Findings

01  Using larger English propaganda datasets does not improve Arabic detection performance.

02  The proposed models achieved second place in both subtasks of the shared task.

03  Multilingual transformer models effectively identify propaganda techniques in Arabic tweets.

Abstract

We present our system for the two subtasks of the shared task on propaganda detection in Arabic, part of WANLP'2022. Subtask 1 is a multi-label classification problem to find the propaganda techniques used in a given tweet. Our system for this task uses XLM-R to predict the probability that the target tweet uses each of the techniques. In addition to finding the techniques, Subtask 2 further asks to identify the textual span for each instance of each technique that is present in the tweet; the task can be modeled as a sequence tagging problem. We use a multi-granularity network with an mBERT encoder for Subtask 2. Overall, our system ranks second for both subtasks (out of 14 and 3 participants, respectively). Our empirical analysis shows that it does not help to use a much larger English corpus annotated with propaganda techniques, regardless of whether used in English or after translation…
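
The two task formulations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the technique names, the 0.5 threshold, the character-offset token format, and both helper functions are assumptions made for illustration. The first function turns per-technique scores into a multi-label prediction (Subtask 1 style); the second converts annotated spans into per-token label sets, which is one common way to set up span identification as sequence tagging (Subtask 2 style).

```python
import math

# Hypothetical technique names, for illustration only;
# the shared task defines its own label inventory.
TECHNIQUES = ["loaded_language", "name_calling", "smears"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_techniques(logits, threshold=0.5):
    """Multi-label decision: each technique gets an independent probability,
    and every technique at or above the (assumed) threshold is predicted."""
    probs = [sigmoid(z) for z in logits]
    return [t for t, p in zip(TECHNIQUES, probs) if p >= threshold]


def spans_to_token_labels(token_offsets, spans):
    """Sequence-tagging view: give each token the set of techniques whose
    annotated character span overlaps it.

    token_offsets: list of (start, end) character offsets, end exclusive.
    spans: list of (start, end, technique) annotations, end exclusive.
    Sets are used so a token can carry more than one technique.
    """
    labels = [set() for _ in token_offsets]
    for span_start, span_end, technique in spans:
        for i, (tok_start, tok_end) in enumerate(token_offsets):
            if tok_start < span_end and tok_end > span_start:
                labels[i].add(technique)
    return labels


if __name__ == "__main__":
    # One score per technique, e.g. from a classification head on an encoder.
    print(predict_techniques([2.0, -1.0, 0.0]))
    # Three tokens; one span covers the last two tokens, another the first.
    print(spans_to_token_labels(
        [(0, 5), (6, 10), (11, 15)],
        [(6, 15, "smears"), (0, 5, "name_calling")],
    ))
```

In a real system the scores would come from a fine-tuned encoder such as XLM-R or mBERT, and the threshold would typically be tuned on development data rather than fixed at 0.5.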

Code & Models

Repositories

sm354/mmgn
PyTorch · Official

Taxonomy

Topics: Misinformation and Its Impacts · Text Readability and Simplification · Natural Language Processing Techniques

Methods: XLM-R · mBERT