MLLM-as-a-Judge for Image Safety without Human Labeling
Zhenting Wang, Shuming Hu, Shiyu Zhao, Xiaowen Lin, Felix Juefei-Xu, Zhuowei Li, Ligong Han, Harihar Subramanyam, Li Chen, Jianfa Chen, Nan Jiang, Lingjuan Lyu, Shiqing Ma, Dimitris N. Metaxas, Ankit Jain

TL;DR
This paper introduces a novel zero-shot approach using pre-trained Multimodal Large Language Models (MLLMs) to identify unsafe images based on safety rules, eliminating the need for human-labeled data and enabling flexible, rule-based safety assessments.
Contribution
The authors propose a new MLLM-based method that objectifies safety rules, assesses relevance, and uses chain-of-thought reasoning to improve zero-shot image safety judgment accuracy.
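The three steps named above can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation: the `ask` callable is a hypothetical stand-in for whatever MLLM interface is available, the prompt wording and the `VERDICT:` output convention are assumptions, and the rules are assumed to have already been rewritten in objective form (the objectification step itself is not shown).

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: takes (image, prompt) and returns the model's text reply.
AskFn = Callable[[object, str], str]


@dataclass
class Verdict:
    unsafe: bool
    violated_rules: List[str]


def is_relevant(ask: AskFn, image: object, rule: str) -> bool:
    """Relevance check: skip rules that have nothing to do with the image."""
    prompt = (
        "Is the following rule relevant to this image? Answer yes or no.\n"
        f"Rule: {rule}"
    )
    return ask(image, prompt).strip().lower().startswith("yes")


def violates(ask: AskFn, image: object, rule: str) -> bool:
    """Chain-of-thought check: ask for reasoning first, then a fixed final line."""
    prompt = (
        f"Rule: {rule}\n"
        "Think step by step about what the image shows, then end with a final "
        "line reading 'VERDICT: VIOLATED' or 'VERDICT: OK'."
    )
    lines = ask(image, prompt).strip().splitlines()
    # Only the last line is parsed; an empty reply counts as no violation.
    return bool(lines) and lines[-1].strip().upper() == "VERDICT: VIOLATED"


def judge_image(ask: AskFn, image: object, rules: List[str]) -> Verdict:
    """Flag the image as unsafe if any relevant rule is judged violated."""
    violated = [
        rule
        for rule in rules
        if is_relevant(ask, image, rule) and violates(ask, image, rule)
    ]
    return Verdict(unsafe=bool(violated), violated_rules=violated)
```

Because the rules are passed in as plain strings, updating the safety policy in this sketch means editing the list rather than relabeling data or retraining, which is the flexibility the summary attributes to the zero-shot setup.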
Findings
High effectiveness in zero-shot safety judgment tasks
Outperforms traditional fine-tuning approaches
Reduces reliance on human-labeled datasets
Abstract
Image content safety has become a significant challenge with the rise of visual media on online platforms. Meanwhile, in the age of AI-generated content (AIGC), many image generation models are capable of producing harmful content, such as images containing sexual or violent material. It is therefore crucial to identify such unsafe images based on established safety rules. Pre-trained Multimodal Large Language Models (MLLMs) offer potential in this regard, given their strong pattern recognition abilities. Existing approaches typically fine-tune MLLMs on human-labeled datasets, which has several drawbacks. First, relying on human annotators to label data following intricate and detailed guidelines is both expensive and labor-intensive. Furthermore, users of safety judgment systems may need to frequently update safety rules, making fine-tuning on human-based annotation…
Taxonomy
Topics: Medical Image Segmentation Techniques · Advanced Neural Network Applications · Brain Tumor Detection and Classification
Methods: Sparse Evolutionary Training
