Cohesion-6K: An Arabic Dataset for Analyzing Social Cohesion and Conflict in Online Discourse

Aisha Ali Al-Athba; Wajdi Zaghouani

arXiv:2605.22447 · cs.CL · May 22, 2026

TL;DR

Cohesion-6K is a new dataset of Arabic Facebook posts annotated with discourse categories ranging from conflict to cohesion, enabling analysis of social cohesion and polarization in online discourse.

Contribution

The paper introduces a dataset of 6,000 Arabic posts annotated with discourse labels through manual annotation assisted by ChatGPT pre-labeling, intended to support computational social science research.

Findings

01. Conflict posts attract two to four times more engagement than Resolution posts.

02. The annotation process achieved a Cohen's kappa of 0.85, indicating high inter-annotator agreement.

03. The dataset supports future research on social cohesion, polarization, and Arabic NLP.

Abstract

The study of online discourse has become central to understanding societal polarization. While much research has focused on detecting overt toxicity, the subtle dynamics of social cohesion, meaning the interaction between divisive and unifying narratives, remain computationally underexplored (Bail, 2021; Gonzalez-Bailon and Lelkes, 2023). This paper presents Cohesion-6K, a manually and ChatGPT-assisted annotated dataset of six thousand Arabic public Facebook posts related to the Israeli Occupation of Palestine. Each post is assigned to one of five discourse categories that represent a continuum from conflict to cohesion: Conflict, Resolution, Community Engagement, Supportive Interactions, and Shared Values. The annotation process combines expert human judgment with model-assisted pre-labeling verified by trained annotators, achieving substantial inter-annotator agreement (Cohen's kappa =…
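
For readers unfamiliar with the agreement statistic mentioned in the abstract, Cohen's kappa compares the observed agreement between two annotators, p_o, with the agreement expected by chance, p_e, as kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that standard formula, not the authors' evaluation code. The function name `cohens_kappa` and the six toy labels are hypothetical and are not taken from the Cohesion-6K data; only the category names come from the abstract.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over categories of the product of each
    # annotator's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if p_e == 1.0:
        # Both annotators used a single identical label throughout;
        # kappa is undefined here, so treat it as perfect agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy labels for illustration only (hypothetical, not from the dataset).
annotator_a = ["Conflict", "Conflict", "Resolution",
               "Shared Values", "Conflict", "Resolution"]
annotator_b = ["Conflict", "Resolution", "Resolution",
               "Shared Values", "Conflict", "Conflict"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.3f}")
```

In this toy case the annotators agree on 4 of 6 items (p_o = 24/36) while chance agreement is p_e = 14/36, giving kappa = 10/22, or roughly 0.455. The raw agreement rate alone would overstate reliability, which is why a chance-corrected measure is the usual choice for multi-category annotation.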
