Make Satire Boring Again: Reducing Stylistic Bias of Satirical Corpus by Utilizing Generative LLMs

Asli Umay Ozturk; Recep Firat Cekinel; Pinar Karagoz

arXiv:2412.09247 · cs.CL · December 13, 2024

TL;DR

This paper introduces a debiasing method using generative LLMs to improve satire detection models' robustness and generalizability across domains and languages, addressing stylistic bias in satirical corpora.

Contribution

The paper proposes a novel debiasing approach that leverages generative large language models, and it introduces the Turkish Satirical News Dataset with detailed human annotations.

Findings

01. Debiasing improves model robustness in cross-domain (irony detection) and cross-lingual (English) settings.

02. The approach improves generalizability for satire and irony detection in Turkish and English.

03. The method has limited impact on causal language models such as Llama-3.1.

Abstract

Satire detection is essential for accurately extracting opinions from textual data and combating misinformation online. However, the lack of diverse corpora for satire leads to the problem of stylistic bias, which affects the models' detection performance. This study proposes a debiasing approach for satire detection that focuses on reducing biases in training data by utilizing generative large language models. The approach is evaluated in both cross-domain (irony detection) and cross-lingual (English) settings. Results show that the debiasing method enhances the robustness and generalizability of the models for satire and irony detection tasks in Turkish and English. However, its impact on causal language models, such as Llama-3.1, is limited. Additionally, this work curates and presents the Turkish Satirical News Dataset with detailed human annotations, with case studies on…

Code & Models

Repositories

auotomaton/satiretr
PyTorch · Official

Taxonomy

Topics: Language, Metaphor, and Cognition · Translation Studies and Practices · Swearing, Euphemism, Multilingualism