mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection

Dominik Macko; Alok Debnath; Jakub Simko

arXiv:2605.02695·cs.CL·May 5, 2026


TL;DR

This paper describes finetuning mid-size LLMs with QLoRA for polarization detection across 22 languages, with the aim of catching online polarization before it escalates into hate speech, offensive discourse, and social fragmentation.

Contribution

It finetunes mid-size LLMs for sequence classification with QLoRA, a parameter-efficient technique, and augments the multilingual training data to make polarization detection more robust.

Findings

01 Achieved effective polarization detection across 22 languages.

02 Enhanced robustness through data augmentation techniques.

03 Demonstrated the viability of finetuning LLMs for social polarization tasks.

Abstract

SemEval-2026 Task 9 focuses on multilingual polarization detection. Specifically, it covers the identification of multilingual, multicultural, and multievent polarization along three axes (one per subtask): detection, type, and manifestation. Online polarization is a concern because it is often followed by hate speech, offensive discourse, and social fragmentation. Detecting it before it escalates is therefore crucial for a safer and more inclusive online space. We addressed this SemEval task by finetuning mid-size LLMs for sequence classification using the QLoRA parameter-efficient finetuning technique. We augmented the multilingual training sets (22 languages) with anonymized, lower-cased, upper-cased, and homoglyph-substituted counterparts of the original texts, making the detection more robust.
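
The augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the anonymization rules (replacing URLs and @mentions with placeholders), the homoglyph table (a small Latin-to-Cyrillic subset), and all function names are assumptions chosen for illustration. Each training text yields its original plus up to four label-preserving variants.

```python
import re

# Hypothetical look-alike table: a few Latin letters mapped to Cyrillic
# homoglyphs. The paper's actual mapping is not given in the abstract.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic a
    "c": "\u0441",  # Cyrillic es
    "e": "\u0435",  # Cyrillic ie
    "o": "\u043e",  # Cyrillic o
    "p": "\u0440",  # Cyrillic er
    "x": "\u0445",  # Cyrillic ha
}

# Assumed anonymization targets: URLs and @mentions.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def anonymize(text):
    """Replace URLs and user mentions with fixed placeholders."""
    text = URL_RE.sub("[URL]", text)
    return MENTION_RE.sub("[USER]", text)


def homoglyph_swap(text):
    """Swap each mapped Latin letter for its visually similar counterpart."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def augment(text, label):
    """Return the original text plus its distinct augmented variants.

    Every variant keeps the original label; duplicates (for example an
    already lower-cased text) are dropped while preserving order.
    """
    variants = [
        text,
        anonymize(text),
        text.lower(),
        text.upper(),
        homoglyph_swap(text),
    ]
    seen = set()
    examples = []
    for variant in variants:
        if variant not in seen:
            seen.add(variant)
            examples.append((variant, label))
    return examples
```

Calling `augment(text, label)` on every row of a training set and concatenating the results gives an augmented set of the kind the abstract describes. A classifier finetuned on it sees the same label under casing changes, masked identifiers, and character substitutions.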
