AI vs. Human Moderators: A Comparative Evaluation of Multimodal LLMs in Content Moderation for Brand Safety
Adi Levi, Or Levi, Sardhendu Mishra, Jonathan Morra

TL;DR
This paper evaluates multimodal large language models for content moderation, specifically brand safety classification, comparing their performance and cost efficiency to human reviewers using a new multilingual dataset.
Contribution
It introduces a novel multilingual, multimodal dataset for brand safety and benchmarks MLLMs against human moderators, highlighting their strengths and limitations.
Findings
MLLMs like Gemini, GPT, and Llama perform effectively in brand safety tasks.
MLLMs are more cost-efficient than human reviewers.
Limitations and failure cases of MLLMs are discussed.
Abstract
As the volume of video content online grows exponentially, the demand for moderation of unsafe videos has surpassed human capabilities, posing both operational and mental health challenges. While recent studies have demonstrated the merits of Multimodal Large Language Models (MLLMs) in various video understanding tasks, their application to multimodal content moderation, a domain that requires nuanced understanding of both visual and textual cues, remains relatively underexplored. In this work, we benchmark the capabilities of MLLMs in brand safety classification, a critical subset of content moderation for safeguarding advertising integrity. To this end, we introduce a novel multimodal and multilingual dataset, meticulously labeled by professional reviewers across a multitude of risk categories. Through a detailed comparative analysis, we demonstrate the effectiveness of MLLMs such as Gemini,…
