Nullpointer at ArAIEval Shared Task: Arabic Propagandist Technique Detection with Token-to-Word Mapping in Sequence Tagging

Abrar Abir; Kemal Oflazer

arXiv:2407.01360·cs.CL·July 2, 2024

Open Access

TL;DR

This paper presents an Arabic propaganda technique detection method that fine-tunes AraBERT v2 for sequence tagging with token-to-word mapping and genre features, placing 4th on the ArAIEval shared task leaderboard with a score of 25.41 (26.68 after post-submission improvements).

Contribution

It introduces a novel sequence tagging approach with token-to-word mapping and genre features for improved propaganda detection in Arabic texts.

Findings

01. First token reliance yields best results

02. Genre features improve model performance

03. Achieved 4th place with a score of 25.41, later improved to 26.68
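
The first finding concerns how subword-level predictions are mapped back to words. The sketch below illustrates one plausible reading of that idea: each word's label is taken from the prediction on its first subword token, and predictions on continuation tokens are ignored. It is not the authors' code. The chunk-based `toy_subword_tokenize` is a hypothetical stand-in for a real subword tokenizer, and the label names are illustrative only.

```python
from typing import List, Tuple


def toy_subword_tokenize(word: str, max_len: int = 3) -> List[str]:
    """Hypothetical stand-in for a subword tokenizer (assumes a non-empty word).

    Splits the word into fixed-size chunks and marks continuation pieces
    with a '##' prefix, mimicking WordPiece-style output.
    """
    pieces = [word[i:i + max_len] for i in range(0, len(word), max_len)]
    return [pieces[0]] + ["##" + p for p in pieces[1:]]


def tokenize_with_word_ids(words: List[str]) -> Tuple[List[str], List[int]]:
    """Tokenize each word and record which word every token came from."""
    tokens: List[str] = []
    word_ids: List[int] = []
    for w_idx, word in enumerate(words):
        for piece in toy_subword_tokenize(word):
            tokens.append(piece)
            word_ids.append(w_idx)
    return tokens, word_ids


def first_token_word_labels(word_ids: List[int], token_labels: List[str]) -> List[str]:
    """Keep, for each word, only the label predicted on its first token."""
    word_labels: List[str] = []
    previous = None
    for w_idx, label in zip(word_ids, token_labels):
        if w_idx != previous:  # first token of a new word
            word_labels.append(label)
        previous = w_idx
    return word_labels


if __name__ == "__main__":
    words = ["propaganda", "is", "loaded"]
    tokens, word_ids = tokenize_with_word_ids(words)
    # Illustrative token-level predictions, one per subword token.
    token_labels = ["Loaded_Language", "O", "O", "Name_Calling", "O", "O", "Loaded_Language"]
    print(list(zip(words, first_token_word_labels(word_ids, token_labels))))
```

With a real tokenizer, the `word_ids` list would come from the tokenizer's own token-to-word alignment rather than from this toy splitter.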

Abstract

This paper investigates the optimization of propaganda technique detection in Arabic text, including tweets and news paragraphs, from ArAIEval shared task 1. Our approach involves fine-tuning the AraBERT v2 model with a neural network classifier for sequence tagging. Experimental results show that relying on the first token of each word for technique prediction produces the best performance. In addition, incorporating genre information as a feature further enhances the model's performance. Our system achieved a score of 25.41, placing us 4th on the leaderboard. Subsequent post-submission improvements further raised our score to 26.68.
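
The abstract does not specify how genre information enters the model. One common way to do it, shown here purely as an assumption, is to append a one-hot genre vector to every token representation before the classifier. The names `GENRES`, `genre_one_hot`, and `add_genre_feature` are hypothetical, and the two-dimensional token vectors are placeholders for real encoder outputs.

```python
from typing import List

# The two genres named in the abstract: tweets and news paragraphs.
GENRES = ["tweet", "paragraph"]


def genre_one_hot(genre: str) -> List[float]:
    """Encode the genre as a one-hot vector over GENRES."""
    if genre not in GENRES:
        raise ValueError(f"unknown genre: {genre!r}")
    return [1.0 if g == genre else 0.0 for g in GENRES]


def add_genre_feature(token_vectors: List[List[float]], genre: str) -> List[List[float]]:
    """Append the same genre vector to every token representation.

    Assumed design: the classifier then sees [token features ; genre one-hot].
    """
    g = genre_one_hot(genre)
    return [vec + g for vec in token_vectors]


if __name__ == "__main__":
    # Placeholder token representations (a real encoder would produce these).
    vectors = [[0.1, 0.2], [0.3, 0.4]]
    for row in add_genre_feature(vectors, "tweet"):
        print(row)
```

Concatenation keeps the encoder untouched and only widens the classifier input by the number of genres; other designs, such as a learned genre embedding or a special genre token in the input, are equally compatible with the abstract's description.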

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification