TIB-VA at SemEval-2022 Task 5: A Multimodal Architecture for the Detection and Classification of Misogynous Memes
Sherzod Hakimov, Gullal S. Cheema, Ralph Ewerth

TL;DR
This paper introduces a multimodal neural architecture that combines textual and visual features to detect and classify misogynous memes. The system obtained the best result in Task-B of the SemEval-2022 Task 5 (MAMI) challenge.
Contribution
It presents a multimodal approach designed specifically for misogyny detection in memes, which obtained the best result in Task-B of the SemEval-2022 Task 5 (MAMI) challenge.
Findings
Obtained the best result in Task-B of the SemEval-2022 Task 5 (MAMI) challenge
Effectively combined textual and visual features for meme analysis
Demonstrated the approach's robustness across multiple misogyny sub-classes
Abstract
The detection of offensive, hateful content on social media is a challenging problem that affects many online users on a daily basis. Hateful content is often used to target a group of people based on ethnicity, gender, religion and other factors. Hate or contempt toward women has been increasing on social platforms. Misogynous content detection is especially challenging when textual and visual modalities are combined to form a single context, e.g., an overlay text embedded on top of an image, also known as a meme. In this paper, we present a multimodal architecture that combines textual and visual features in order to detect misogynous meme content. The proposed architecture is evaluated in the SemEval-2022 Task 5: MAMI - Multimedia Automatic Misogyny Identification challenge under the team name TIB-VA. Our solution obtained the best result in Task-B, where the challenge is to…
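To make the idea of combining textual and visual features concrete, the following is a minimal sketch of one common fusion strategy: concatenating a text feature vector and an image feature vector, then scoring the joint vector with a logistic classifier. It is an illustration only and is not taken from the paper. The function names `fuse` and `predict_proba`, the toy feature values, the weights, and the 0.5 threshold are all hypothetical, and the abstract does not state which encoders or fusion method the authors use.

```python
import math


def fuse(text_features, image_features):
    """Concatenate the two modality vectors into one joint vector (assumed fusion)."""
    return list(text_features) + list(image_features)


def predict_proba(features, weights, bias):
    """Score a fused vector with a logistic classifier and return a value in (0, 1)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy inputs: in practice these would come from pretrained
# text and image encoders, which are not specified here.
text_vec = [0.2, -0.1, 0.7]
image_vec = [0.5, 0.3]

fused = fuse(text_vec, image_vec)

# Hypothetical weights; a real system would learn these from labelled memes.
weights = [0.4, -0.2, 0.9, 0.1, -0.3]
bias = -0.1

score = predict_proba(fused, weights, bias)
label = score >= 0.5  # assumed decision threshold
```

Concatenation is the simplest way to let a single classifier see both modalities at once. Other fusion schemes, such as attention over modalities, follow the same overall pattern of encoding each modality and then combining the results.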
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
