Cordyceps@LT-EDI: Patching Language-Specific Homophobia/Transphobia Classifiers with a Multilingual Understanding
Dean Ninalga

TL;DR
This paper introduces a combined multilingual and language-specific approach to detecting homophobia and transphobia in social media. It merges the two kinds of model through interpretable weight interpolation to improve accuracy across multiple languages.
Contribution
It presents a novel method for merging multilingual and language-specific classifiers via weight interpolation, enhancing hate speech detection across diverse languages.
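To make the idea of weight interpolation concrete, the following is a minimal sketch, not the paper's implementation. It assumes both classifiers share the same architecture, so their parameters can be blended name by name as `(1 - alpha) * M-L + alpha * L-S`. The names `interpolate_weights`, `pick_alpha`, and `score_fn`, and the choice to select `alpha` by a validation score, are illustrative assumptions rather than details taken from the source.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def interpolate_weights(ml: Weights, ls: Weights, alpha: float) -> Weights:
    """Blend two weight sets: alpha=0 gives the M-L model, alpha=1 the L-S model."""
    if ml.keys() != ls.keys():
        raise ValueError("both models must have the same parameter names")
    merged: Weights = {}
    for name in ml:
        if len(ml[name]) != len(ls[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * m + alpha * l for m, l in zip(ml[name], ls[name])
        ]
    return merged


def pick_alpha(
    ml: Weights,
    ls: Weights,
    score_fn: Callable[[Weights], float],
    candidates: Sequence[float],
) -> Tuple[float, float]:
    """Return the (alpha, score) pair with the highest score.

    score_fn is an assumed callback, e.g. macro F1 on a held-out set.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    best_alpha, best_score = candidates[0], float("-inf")
    for alpha in candidates:
        score = score_fn(interpolate_weights(ml, ls, alpha))
        if score > best_score:
            best_alpha, best_score = alpha, score
    return best_alpha, best_score


if __name__ == "__main__":
    # Toy weights, made up for illustration only.
    ml_weights = {"classifier.weight": [0.0, 2.0]}
    ls_weights = {"classifier.weight": [1.0, 4.0]}
    print(interpolate_weights(ml_weights, ls_weights, 0.5))
```

Under this sketch, the selected `alpha` is directly readable as how much the merged model leans on the language-specific weights, which is one way an interpolation coefficient can be considered interpretable.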
Findings
Achieved best results in three of five languages
Attained a 0.997 macro F1-score on Malayalam texts
Demonstrated the effectiveness of the combined models on real datasets
Abstract
Detecting transphobia, homophobia, and various other forms of hate speech is difficult. Signals can vary depending on factors such as language, culture, geographical region, and the particular online platform. Here, we present a joint multilingual (M-L) and language-specific (L-S) approach to homophobic and transphobic hate speech detection (HSD). M-L models are needed to catch words, phrases, and concepts that are less common or missing in a particular language and are therefore overlooked by L-S models. L-S models, in turn, are better placed to understand the cultural and linguistic context of the users who typically write in a particular language. We merge the M-L and L-S approaches through simple weight interpolation in a way that is both interpretable and data-driven. We demonstrate our system on task A of the 'Shared Task on…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
