MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection under Cloaking Perturbations
Qiyao Xue, Yuchen Dou, Ryan Shi, Xiang Lorraine Li, Wei Gao

TL;DR
This paper introduces MMBERT, a multimodal BERT-based model with a Mixture-of-Experts architecture, designed to improve robustness in Chinese hate speech detection, especially against cloaking techniques, by integrating text, speech, and visual data.
Contribution
The paper presents a novel multimodal framework with a three-stage training process and expert routing, enhancing robustness against adversarial cloaking in Chinese hate speech detection.
Findings
MMBERT outperforms existing BERT-based and LLM-based models on Chinese hate speech datasets.
The proposed model demonstrates increased robustness against adversarial perturbations.
Empirical results confirm the effectiveness of the multimodal and MoE architecture.
Abstract
Hate speech detection on Chinese social networks presents distinct challenges, particularly due to the widespread use of cloaking techniques designed to evade conventional text-based detection systems. Although large language models (LLMs) have recently improved hate speech detection capabilities, the majority of existing work has concentrated on English datasets, with limited attention given to multimodal strategies in the Chinese context. In this study, we propose MMBERT, a novel BERT-based multimodal framework that integrates textual, speech, and visual modalities through a Mixture-of-Experts (MoE) architecture. To address the instability associated with directly integrating MoE into BERT-based models, we develop a progressive three-stage training paradigm. MMBERT incorporates modality-specific experts, a shared self-attention mechanism, and a router-based expert allocation strategy…
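The router-based expert allocation mentioned in the abstract can be illustrated with a generic Mixture-of-Experts layer. The sketch below is not the authors' implementation: the abstract does not specify the routing rule, so softmax gating with top-k selection is assumed here, and the names `route`, `moe_layer`, the router weights, and the three toy experts (standing in for text, speech, and vision experts) are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of router logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(token: Vector, router_weights: Sequence[Vector],
          top_k: int = 1) -> List[Tuple[int, float]]:
    """Return (expert_index, gate) pairs for the top_k experts.

    Assumption: a linear router followed by softmax, with the selected
    gates renormalized to sum to 1. The paper may use a different rule.
    """
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token: Vector, router_weights: Sequence[Vector],
              experts: Sequence[Callable[[Vector], Vector]],
              top_k: int = 1) -> Vector:
    """Combine the outputs of the selected experts, weighted by their gates."""
    out = [0.0] * len(token)
    for idx, gate in route(token, router_weights, top_k):
        expert_out = experts[idx](token)
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out


# Hypothetical stand-ins for modality-specific experts (text, speech, vision).
toy_experts = [
    lambda v: [1.0 * x for x in v],
    lambda v: [2.0 * x for x in v],
    lambda v: [3.0 * x for x in v],
]
toy_router = [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]

routed = route([1.0, 0.0], toy_router, top_k=2)
mixed = moe_layer([1.0, 0.0], toy_router, toy_experts, top_k=1)
```

With `top_k=1` each token is sent to a single expert; a larger `top_k` blends several experts by their renormalized gate values.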
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining
