Nullpointer at ArAIEval Shared Task: Arabic Propagandist Technique Detection with Token-to-Word Mapping in Sequence Tagging
Abrar Abir, Kemal Oflazer

TL;DR
This paper presents an Arabic propaganda technique detection method that fine-tunes AraBERT v2 with token-to-word mapping and genre features, placing 4th on the ArAIEval shared task leaderboard.
Contribution
It introduces a novel sequence tagging approach with token-to-word mapping and genre features for improved propaganda detection in Arabic texts.
Findings
Relying on the first token of each word for technique prediction yields the best results
Genre features improve model performance
Achieved 4th place with a score of 25.41, later improved to 26.68
Abstract
This paper investigates the optimization of propaganda technique detection in Arabic text, including tweets and news paragraphs, from ArAIEval shared task 1. Our approach involves fine-tuning the AraBERT v2 model with a neural network classifier for sequence tagging. Experimental results show that relying on the first token of each word for technique prediction produces the best performance. In addition, incorporating genre information as a feature further enhances the model's performance. Our system achieved a score of 25.41, placing us 4th on the leaderboard. Subsequent post-submission improvements further raised our score to 26.68.
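The first-token strategy mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the tagger emits one label per subword token and that a token-to-word index list is available, in the same shape that Hugging Face fast tokenizers return from `word_ids()` (`None` for special tokens). The function name, the label strings, and the single-label simplification are all assumptions made for illustration.

```python
from typing import List, Optional


def first_token_word_labels(
    word_ids: List[Optional[int]],
    token_labels: List[str],
) -> List[str]:
    """Map token-level predictions to word-level predictions.

    For each word, keep only the label predicted at its first subword
    token and ignore the labels of the remaining subword tokens.
    """
    if len(word_ids) != len(token_labels):
        raise ValueError("word_ids and token_labels must have the same length")

    word_labels: List[str] = []
    seen = set()
    for word_id, label in zip(word_ids, token_labels):
        if word_id is None:
            # Special tokens (e.g. [CLS], [SEP]) belong to no word.
            continue
        if word_id not in seen:
            # First subword token of this word: its label stands for the word.
            seen.add(word_id)
            word_labels.append(label)
    return word_labels


# Hypothetical example: three words split into 2, 1, and 3 subword tokens.
example_word_ids = [None, 0, 0, 1, 2, 2, 2, None]
example_token_labels = [
    "O",                # [CLS]
    "Loaded_Language",  # word 0, first token (kept)
    "O",                # word 0, second token (ignored)
    "O",                # word 1, first token (kept)
    "Name_Calling",     # word 2, first token (kept)
    "Name_Calling",     # word 2, second token (ignored)
    "O",                # word 2, third token (ignored)
    "O",                # [SEP]
]
example_word_labels = first_token_word_labels(example_word_ids, example_token_labels)
```

With these inputs, `example_word_labels` holds one label per word. Alternatives such as majority voting over all subword tokens of a word would replace only the branch that selects the label.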
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
