cantnlp@DravidianLangTech 2026: organic domain adaptation improves multi-class hope speech detection in Tulu

Andrew Li; Sidney Wong

arXiv:2605.09795·cs.CL·May 12, 2026

TL;DR

This paper explores organic domain adaptation of XLM-RoBERTa for hope speech detection in code-mixed Tulu social media comments. The adapted model outperformed the baseline on the development set, though the submitted systems performed more modestly on the official test set.

Contribution

It provides evidence that further adapting XLM-RoBERTa on organically collected Tulu social media text, including code-mixed and mixed-script variation, can improve hope speech detection in code-mixed Tulu.

Findings

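01

The organically adapted model outperformed the baseline system on the development set.

02

The submitted systems performed more modestly on the official test set.

03

The results suggest that further adaptation on organically collected Tulu text with code-mixed and mixed-script variation can improve hope speech detection.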

Abstract

This paper presents our systems and results for the Hope Speech Detection in Code-Mixed Tulu Language shared task at the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages (DravidianLangTech-2026). We trained an XLM-RoBERTa-based text classification system for detecting hope speech in code-mixed Tulu social media comments. We compared this organically adapted hope speech detection model with our baseline model. On the development set, the organically adapted model outperformed the baseline system. While our submitted systems performed more modestly on the official test set, these results suggest that further adapting XLM-RoBERTa on organically collected Tulu social media text containing code-mixed and mixed-script variation can improve hope speech detection in code-mixed Tulu.
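The abstract does not spell out the adaptation procedure. A common way to further adapt an encoder such as XLM-RoBERTa to in-domain text is to continue training it with the masked language modelling objective on unlabelled comments before fine-tuning the classifier. The sketch below illustrates only the token-masking step of that objective, under that assumption. The function name `mask_tokens`, the 15% masking rate, the 80/10/10 replacement split, and the toy vocabulary are illustrative choices and are not taken from the paper.

```python
import random

MASK = "<mask>"  # mask token string used by XLM-RoBERTa tokenizers


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Corrupt one token sequence for masked language modelling.

    Returns (corrupted, labels). labels[i] is the original token at
    positions the model must predict, and None at positions that the
    loss ignores. All parameters are hypothetical, for illustration.
    """
    if rng is None:
        rng = random.Random()
    corrupted, labels = [], []
    for tok in tokens:
        # Most positions are left untouched and excluded from the loss.
        if rng.random() >= mask_prob:
            corrupted.append(tok)
            labels.append(None)
            continue
        # Selected position: the model must recover the original token.
        labels.append(tok)
        roll = rng.random()
        if roll < 0.8:
            corrupted.append(MASK)               # replace with mask token
        elif roll < 0.9:
            corrupted.append(rng.choice(vocab))  # replace with random token
        else:
            corrupted.append(tok)                # keep the token unchanged
    return corrupted, labels


if __name__ == "__main__":
    # Placeholder tokens standing in for a tokenized social media comment.
    comment = ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8"]
    toy_vocab = ["w1", "w2", "w3", "w9"]
    print(mask_tokens(comment, toy_vocab, rng=random.Random(13)))
```

In practice this step is usually handled by a library data collator rather than written by hand, and the adapted encoder is then fine-tuned on the labelled hope speech data.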
