Multi-Agent VLMs Guided Self-Training with PNU Loss for Low-Resource Offensive Content Detection

Han Wang; Deyi Ji; Junyu Lu; Lanyun Zhu; Hailong Zhang; Haiyang Wu; Liqun Liu; Peng Shu; Roy Ka-Wei Lee

arXiv:2511.13759 · cs.LG · November 19, 2025

Open Access

TL;DR

This paper introduces a self-training framework using multi-agent vision-language models and a PNU loss to improve offensive content detection in low-resource settings, effectively leveraging unlabeled data.

Contribution

It proposes a novel collaborative pseudo-labeling approach with multi-agent models and a PNU loss for low-resource offensive content detection.

Findings

01 Outperforms baseline methods with limited labeled data

02 Approaches the performance of large-scale models

03 Effectively leverages unlabeled data through collaborative pseudo-labeling

Abstract

Accurate detection of offensive content on social media demands high-quality labeled data; however, such data is often scarce due to the low prevalence of offensive instances and the high cost of manual annotation. To address this low-resource challenge, we propose a self-training framework that leverages abundant unlabeled data through collaborative pseudo-labeling. Starting with a lightweight classifier trained on limited labeled data, our method iteratively assigns pseudo-labels to unlabeled instances with the support of Multi-Agent Vision-Language Models (MA-VLMs). Unlabeled data on which the classifier and MA-VLMs agree are designated as the Agreed-Unknown set, while conflicting samples form the Disagreed-Unknown set. To enhance label reliability, MA-VLMs simulate dual perspectives, moderator and user, capturing both regulatory and subjective viewpoints. The classifier is…
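The split into Agreed-Unknown and Disagreed-Unknown sets described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `partition_unlabeled`, the binary label encoding (1 = offensive, 0 = not offensive), and the rule that a sample counts as agreed only when the classifier, the moderator agent, and the user agent all return the same label are assumptions made for illustration. The abstract does not state how the two agent perspectives are combined.

```python
from typing import Dict, List, Sequence, Tuple


def partition_unlabeled(
    clf_labels: Sequence[int],
    moderator_labels: Sequence[int],
    user_labels: Sequence[int],
) -> Tuple[Dict[int, int], List[int]]:
    """Split unlabeled samples by agreement between classifier and agents.

    Hypothetical helper. Assumes labels are 1 (offensive) or 0 (not
    offensive), and that "agreement" means all three sources match.
    Returns (agreed, disagreed): `agreed` maps sample index to the shared
    pseudo-label, `disagreed` lists indices with conflicting labels.
    """
    if not (len(clf_labels) == len(moderator_labels) == len(user_labels)):
        raise ValueError("all label sequences must have the same length")

    agreed: Dict[int, int] = {}
    disagreed: List[int] = []
    for idx, (c, m, u) in enumerate(zip(clf_labels, moderator_labels, user_labels)):
        if c == m == u:
            agreed[idx] = c        # Agreed-Unknown: keep the shared pseudo-label
        else:
            disagreed.append(idx)  # Disagreed-Unknown: label left unresolved
    return agreed, disagreed


# Toy predictions for four unlabeled posts (made-up values).
agreed_set, disagreed_set = partition_unlabeled(
    clf_labels=[1, 0, 1, 0],
    moderator_labels=[1, 0, 0, 0],
    user_labels=[1, 1, 0, 0],
)
```

In this toy run, posts 0 and 3 land in the agreed set with pseudo-labels 1 and 0, while posts 1 and 2 land in the disagreed set. In an iterative self-training loop, a partition like this would be recomputed each round after the classifier is retrained.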

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Spam and Phishing Detection · Misinformation and Its Impacts