PclGPT: A Large Language Model for Patronizing and Condescending Language Detection

Hongbo Wang, Mingda Li, Junyu Lu, Hebin Xia, Liang Yang, Bo Xu, Ruizhu Liu, Hongfei Lin

arXiv:2410.00361 · cs.CL · October 2, 2024

TL;DR

This paper introduces PclGPT, a large language model benchmark specifically designed to detect patronizing and condescending language, addressing the limitations of traditional models in identifying implicit toxicity.

Contribution

The paper develops PclGPT, a novel bilingual LLM benchmark with a specialized dataset and training process for implicit toxic language detection, highlighting biases across vulnerable groups.

Findings

01. PclGPT outperforms traditional models in detecting PCL.

02. The degree of bias in PCL varies significantly across the vulnerable groups it targets.

03. The work aims to raise societal awareness of implicit toxicity.

Abstract

Disclaimer: Samples in this paper may be harmful and cause discomfort!

Patronizing and condescending language (PCL) is a form of speech directed at vulnerable groups. As an essential branch of toxic language, this type of language exacerbates conflicts and confrontations among Internet communities and detrimentally impacts disadvantaged groups. Traditional pre-trained language models (PLMs) perform poorly in detecting PCL due to its implicit toxicity traits like hypocrisy and false sympathy. With the rise of large language models (LLMs), we can harness their rich emotional semantics to establish a paradigm for exploring implicit toxicity. In this paper, we introduce PclGPT, a comprehensive LLM benchmark designed specifically for PCL. We collect, annotate, and integrate the Pcl-PT/SFT dataset, and then develop a bilingual PclGPT-EN/CN model group through a comprehensive pre-training…

Code & Models

Repositories

dut-laowang/emnlp24-PclGPT (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Authorship Attribution and Profiling · Translation Studies and Practices

Methods: Softmax · Attention Is All You Need