GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse
Hongzhan Lin, Ziyang Luo, Bo Wang, Ruichao Yang, Jing Ma

TL;DR
This paper introduces GOAT-Bench, a benchmark of over 6K memes, to evaluate how well large multimodal models (LMMs) detect implicit social abuse in memes. The results reveal safety shortcomings in current models.
Contribution
The paper presents GOAT-Bench, a comprehensive meme benchmark with over 6,000 memes, and evaluates large multimodal models' capacity to recognize implicit social abuse, highlighting their safety limitations.
Findings
Current LMMs are insensitive to implicit abuse in memes.
GOAT-Bench enables systematic evaluation of safety awareness in multimodal models.
Models need further development to improve safety and abuse detection.
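As a rough illustration of what "systematic evaluation" could look like in practice, the sketch below scores a model's binary abusive/benign predictions against gold labels. This summary does not state the benchmark's metrics or data format, so the choice of accuracy and macro-averaged F1, the 1/0 label encoding, and the function name `binary_scores` are all assumptions made for illustration, not the authors' protocol.

```python
from typing import Dict, Sequence


def binary_scores(gold: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy and macro-F1 for binary labels (assumed: 1 = abusive, 0 = benign)."""
    if len(gold) != len(pred) or len(gold) == 0:
        raise ValueError("gold and pred must be non-empty and of equal length")

    def f1_for(label: int) -> float:
        # Per-class F1 computed from true positives, false positives, false negatives.
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0

    accuracy = sum(1 for g, p in zip(gold, pred) if g == p) / len(gold)
    # Macro-F1 averages the per-class scores, so the minority class counts equally.
    macro_f1 = (f1_for(0) + f1_for(1)) / 2
    return {"accuracy": accuracy, "macro_f1": macro_f1}


# Hypothetical toy labels: the model misses one of two abusive memes.
scores = binary_scores(gold=[1, 1, 0, 0], pred=[1, 0, 0, 0])
print(scores)
```

Macro-averaging is used here because a model that misses implicit abuse can still post high accuracy when benign memes dominate; averaging per-class F1 makes that failure visible.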
Abstract
The exponential growth of social media has profoundly transformed how information is created, disseminated, and absorbed, exceeding any precedent in the digital age. Regrettably, this explosion has also spawned a significant increase in the online abuse of memes. Evaluating the negative impact of memes is notably challenging, owing to their often subtle and implicit meanings, which are not directly conveyed through the overt text and image. In light of this, large multimodal models (LMMs) have emerged as a focal point of interest due to their remarkable capabilities in handling diverse multimodal tasks. In response to this development, our paper aims to thoroughly examine the capacity of various LMMs (e.g., GPT-4o) to discern and respond to the nuanced aspects of social abuse manifested in memes. We introduce the comprehensive meme benchmark, GOAT-Bench, comprising over 6K varied memes…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts
