ParsCN: A Persian Dataset for Counter-Narrative Generation to Combat Online Hate Speech
Zahra Safdari Fesaghandis, Suman Kalyan Maity

TL;DR
ParsCN is the first comprehensive Persian dataset for counter-narrative generation, enabling research to combat online hate speech in low-resource, culturally specific contexts.
Contribution
Introduces ParsCN, a high-quality Persian dataset of hate speech and counter-narrative pairs, and proposes a scalable, culturally informed generation framework for low-resource languages.
Findings
Human counter-narratives scored highest in relevance and fluency.
Automatic evaluations show high semantic alignment and low toxicity.
Benchmark results highlight the challenges existing models face in the Persian context.
Abstract
Online hate speech threatens online civility, particularly in low-resource and multilingual environments. Counter-narratives offer a promising solution by promoting constructive responses to hate speech. However, automatic counter-narrative generation is hindered by the lack of high-quality data for low-resource languages like Persian. To bridge this gap, we introduce ParsCN, the first and most comprehensive Persian counter-narrative dataset. Consisting of 1,100 hate speech and counter-narrative pairs, it provides fine-grained annotations across six target groups and six countering strategies, tailored to the socio-cultural context of Persian online discourse. We propose a novel, scalable multi-stage framework that integrates culturally-informed human annotation with few-shot LLM-augmented generation, guided by semantic retrieval and rigorous manual curation. This approach enables the…
