Advancing Harmful Content Detection in Organizational Research: Integrating Large Language Models with Elo Rating System

Mustafa Akben; Aaron Satko

arXiv:2506.16575·cs.AI·June 23, 2025

Open Access

TL;DR

This paper presents an Elo rating-based method that enhances large language models' ability to detect harmful content in organizational research, outperforming traditional techniques in accuracy and scalability.

Contribution

The paper introduces a novel Elo rating system integrated with LLMs to improve harmful content detection in organizational datasets, addressing the limitations that built-in LLM moderation imposes on this kind of analysis.

Findings

01. Outperforms traditional prompting techniques in accuracy and F1 scores

02. Reduces false positives in harmful content detection

03. Enhances scalability for large datasets

Abstract

Large language models (LLMs) offer promising opportunities for organizational research. However, their built-in moderation systems can create problems when researchers try to analyze harmful content: the models often refuse to follow certain instructions or produce overly cautious responses that undermine the validity of the results. This is particularly problematic when analyzing organizational conflicts such as microaggressions or hate speech. This paper introduces an Elo rating-based method that significantly improves LLM performance for harmful content analysis. In two datasets, one focused on microaggression detection and the other on hate speech, we find that our method outperforms traditional LLM prompting techniques and conventional machine learning models on key measures such as accuracy, precision, and F1 scores. Advantages include better reliability when analyzing harmful content, fewer…
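The abstract does not spell out how the Elo ratings are computed, so the following is only a minimal sketch of the standard Elo update applied to pairwise comparisons of texts, not the authors' implementation. The names `expected_score`, `update`, `rank_texts`, and `toy_judge` are hypothetical, as are the K-factor of 32 and the starting rating of 1000. In a real pipeline the judge would presumably be an LLM asked which of two texts is more harmful; here a toy keyword counter stands in so the example runs on its own.

```python
import itertools
from typing import Callable, Dict, List, Tuple


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation that item A 'wins' against item B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> Tuple[float, float]:
    """Return updated ratings after one comparison.

    score_a is 1.0 if A was judged more harmful, 0.0 if B was, 0.5 for a tie.
    k = 32 is an assumed K-factor, not a value taken from the paper.
    """
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


def rank_texts(
    texts: List[str],
    judge: Callable[[str, str], float],
    k: float = 32.0,
    initial: float = 1000.0,
) -> Dict[str, float]:
    """Compare every pair of (unique) texts once and return their Elo ratings."""
    ratings = {t: initial for t in texts}
    for a, b in itertools.combinations(texts, 2):
        score_a = judge(a, b)
        ratings[a], ratings[b] = update(ratings[a], ratings[b], score_a, k)
    return ratings


def toy_judge(a: str, b: str) -> float:
    """Hypothetical stand-in for an LLM judge: counts one marker word."""
    count_a = a.lower().count("useless")
    count_b = b.lower().count("useless")
    if count_a == count_b:
        return 0.5
    return 1.0 if count_a > count_b else 0.0


if __name__ == "__main__":
    texts = [
        "Thanks for the update.",
        "That idea is useless.",
        "Useless team, useless plan.",
    ]
    ratings = rank_texts(texts, toy_judge)
    for text, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
        print(f"{rating:7.1f}  {text}")
```

A higher rating here means the text "won" more comparisons, that is, it was judged more harmful relative to the others. How ratings would then be turned into labels (for example, by a threshold) is not stated in the abstract and is left out of the sketch.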

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Computational and Text Analysis Methods · Topic Modeling