A comprehensive cross-language framework for harmful content detection with the aid of sentiment analysis

Mohammad Dehghani

arXiv:2403.01270·cs.CL·March 5, 2024·2 cites

Open Access

TL;DR

This paper introduces a universal, detailed framework for harmful content detection across languages, integrating sentiment analysis and demonstrating high accuracy on a Persian dataset with machine learning methods.

Contribution

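The paper presents what the author describes as the first comprehensive, adaptable framework for harmful language detection. It includes detailed annotation guidelines and incorporates sentiment analysis, addressing gaps such as the absence of a universal definition of harmful language and inadequate datasets across languages.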

Findings

01 Achieved 99.4% accuracy in offensive language detection

02 Developed a Persian dataset with detailed annotations

03 Demonstrated the framework's effectiveness in a low-resource language

Abstract

In today's digital world, social media plays a significant role in facilitating communication and content sharing. However, the exponential rise in user-generated content has led to challenges in maintaining a respectful online environment. In some cases, users have taken advantage of anonymity to use harmful language, which can negatively affect the user experience and pose serious social problems. Recognizing the limitations of manual moderation, automatic detection systems have been developed to tackle this problem. Nevertheless, several obstacles persist, including the absence of a universal definition of harmful language, inadequate datasets across languages, the need for detailed annotation guidelines, and, most importantly, the lack of a comprehensive framework. This study aims to address these challenges by introducing, for the first time, a detailed framework adaptable to any…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Advanced Malware Detection Techniques · Digital and Cyber Forensics

Methods: Sparse Evolutionary Training