MemeCLIP: Leveraging CLIP Representations for Multimodal Meme Classification

Siddhant Bikram Shah; Shuvam Shiwakoti; Maheep Chaudhary; Haohan Wang

arXiv:2409.14703 · cs.LG · October 29, 2024

TL;DR

MemeCLIP leverages CLIP representations to improve multimodal meme classification across multiple linguistic aspects, introducing a new dataset and achieving superior results over existing methods.

Contribution

This paper presents MemeCLIP, a novel framework that enhances multimodal meme classification by utilizing pre-trained CLIP, along with a new dataset for LGBTQ+ Pride memes.

Findings

01. MemeCLIP outperforms previous models on benchmark datasets.

02. The new PrideMM dataset fills a gap in multimodal meme analysis.

03. MemeCLIP is compared against zero-shot GPT-4 on the hate classification task.

Abstract

The complexity of text-embedded images presents a formidable challenge in machine learning given the need for multimodal understanding of multiple aspects of expression conveyed by them. While previous research in multimodal analysis has primarily focused on singular aspects such as hate speech and its subclasses, this study expands this focus to encompass multiple aspects of linguistics: hate, targets of hate, stance, and humor. We introduce a novel dataset PrideMM comprising 5,063 text-embedded images associated with the LGBTQ+ Pride movement, thereby addressing a serious gap in existing resources. We conduct extensive experimentation on PrideMM by using unimodal and multimodal baseline methods to establish benchmarks for each task. Additionally, we propose a novel framework MemeCLIP for efficient downstream learning while preserving the knowledge of the pre-trained CLIP model. The…
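The idea of efficient downstream learning that preserves a pre-trained model's knowledge can be illustrated with a minimal PyTorch sketch. This is not the authors' MemeCLIP architecture: the class name `FrozenBackboneClassifier`, the bottleneck adapter, and the `nn.Linear` stand-in for a pre-trained encoder are all assumptions made for illustration. The sketch only shows the general pattern of freezing the pre-trained weights and training a small number of added parameters.

```python
import torch
import torch.nn as nn


class FrozenBackboneClassifier(nn.Module):
    """Hypothetical sketch: frozen pre-trained encoder plus small trainable layers."""

    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        # Freeze the pre-trained weights so fine-tuning cannot overwrite them.
        for param in self.backbone.parameters():
            param.requires_grad = False
        # Lightweight bottleneck adapter (an assumed design, not from the paper).
        self.adapter = nn.Sequential(
            nn.Linear(feat_dim, feat_dim // 4),
            nn.ReLU(),
            nn.Linear(feat_dim // 4, feat_dim),
        )
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # No gradients are needed through the frozen encoder.
        with torch.no_grad():
            feats = self.backbone(x)
        # Residual connection keeps the original features in the final representation.
        feats = feats + self.adapter(feats)
        return self.head(feats)


torch.manual_seed(0)
backbone = nn.Linear(32, 16)  # stand-in for a pre-trained encoder (assumption)
model = FrozenBackboneClassifier(backbone, feat_dim=16, num_classes=2)
logits = model(torch.randn(4, 32))  # batch of 4 dummy inputs
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
```

Here only the adapter and classification head appear in `trainable`, so an optimizer built from `filter(lambda p: p.requires_grad, model.parameters())` would leave the encoder untouched.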

Code & Models

Repositories

siddhantbikram/memeclip (PyTorch, official)

Videos

MemeCLIP: Leveraging CLIP Representations for Multimodal Meme Classification · Underline

Taxonomy

Topics: Misinformation and Its Impacts · Humor Studies and Applications · Hate Speech and Cyberbullying Detection

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Dropout · Dense Connections