HMS-BERT: Hybrid Multi-Task Self-Training for Multilingual and Multi-Label Cyberbullying Detection

Zixin Feng; Xinying Cui; Yifan Sun; Zheng Wei; Jiachen Yuan; Jiazhen Hu; Ning Xin; Md Maruf Hasan

arXiv:2603.12920·cs.CL·March 16, 2026

Open Access

TL;DR

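HMS-BERT is a hybrid multi-task self-training framework for multilingual, multi-label cyberbullying detection. It combines contextual representations from a pretrained multilingual BERT backbone with handcrafted linguistic features and applies iterative self-training with confidence-based pseudo-labeling, reporting a macro F1-score of up to 0.9847 on the multi-label task.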

Contribution

The paper introduces HMS-BERT, combining multi-task learning with self-training and linguistic features for improved multilingual, multi-label cyberbullying detection.
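As a rough illustration of what jointly optimizing a multi-label task and a three-class task can look like, the sketch below combines a binary cross-entropy term (one independent decision per abuse label) with a categorical cross-entropy term (one choice among three classes). The weighted-sum form, the `alpha` weight, and all function names are assumptions made for illustration; this page does not state the loss HMS-BERT actually uses.

```python
import math

EPS = 1e-12  # guards against log(0)


def bce_multilabel(probs, targets):
    """Mean binary cross-entropy over independent labels (multi-label head)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def ce_multiclass(probs, target_index):
    """Categorical cross-entropy for a single example (main three-class head)."""
    return -math.log(max(probs[target_index], EPS))


def joint_loss(label_probs, label_targets, main_probs, main_target, alpha=0.5):
    """Weighted sum of the two task losses.

    alpha is a hypothetical mixing weight, not a value taken from the paper.
    """
    return (alpha * bce_multilabel(label_probs, label_targets)
            + (1.0 - alpha) * ce_multiclass(main_probs, main_target))


if __name__ == "__main__":
    # Two abuse labels and three main classes, with made-up probabilities.
    loss = joint_loss([0.8, 0.1], [1, 0], [0.7, 0.2, 0.1], 0)
    print(f"joint loss: {loss:.4f}")
```

A single scalar like this lets one backward pass update the shared encoder for both tasks, which is the usual motivation for a multi-task setup.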

Findings

01 Achieves a macro F1-score of up to 0.9847 on multi-label detection

02 Attains 0.6775 accuracy on the three-class main classification task

03 Shows effective cross-lingual knowledge transfer to low-resource languages

Abstract

Cyberbullying on social media is inherently multilingual and multi-faceted, where abusive behaviors often overlap across multiple categories. Existing methods are commonly limited by monolingual assumptions or single-task formulations, which restrict their effectiveness in realistic multilingual and multi-label scenarios. In this paper, we propose HMS-BERT, a hybrid multi-task self-training framework for multilingual and multi-label cyberbullying detection. Built upon a pretrained multilingual BERT backbone, HMS-BERT integrates contextual representations with handcrafted linguistic features and jointly optimizes a fine-grained multi-label abuse classification task and a three-class main classification task. To address labeled data scarcity in low-resource languages, an iterative self-training strategy with confidence-based pseudo-labeling is introduced to facilitate cross-lingual…
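The confidence-based pseudo-labeling step mentioned in the abstract can be sketched generically as follows. This is a minimal illustration of one self-training round for a three-class task, not the authors' procedure: the 0.9 threshold, the function names, and the `predict_proba` callback are all hypothetical, and the paper's actual selection rule and retraining schedule are not given on this page.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (index, predicted_class) pairs for examples whose top class
    probability is at least `threshold` (a hypothetical cutoff)."""
    selected = []
    for i, probs in enumerate(probabilities):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected.append((i, best))
    return selected


def self_training_round(labeled, unlabeled, predict_proba, threshold=0.9):
    """Move confidently predicted examples from the unlabeled pool into the
    labeled set, returning (new_labeled, remaining_unlabeled).

    `predict_proba` is an assumed callback mapping one example to a list of
    class probabilities; in practice it would wrap the current model.
    """
    probs = [predict_proba(x) for x in unlabeled]
    chosen = dict(select_pseudo_labels(probs, threshold))
    new_labeled = labeled + [(unlabeled[i], label) for i, label in chosen.items()]
    remaining = [x for i, x in enumerate(unlabeled) if i not in chosen]
    return new_labeled, remaining


if __name__ == "__main__":
    # Stand-in model: fixed probabilities per text, for demonstration only.
    fake_scores = {
        "text a": [0.95, 0.03, 0.02],
        "text b": [0.40, 0.35, 0.25],
    }
    labeled, pool = self_training_round(
        [("seed text", 1)], list(fake_scores), fake_scores.__getitem__
    )
    print(labeled)
    print(pool)
```

In an iterative setting, the model would be retrained on the enlarged labeled set and the round repeated until the pool is exhausted or no example clears the threshold.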

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Bullying, Victimization, and Aggression · Authorship Attribution and Profiling