Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes

Shiling Deng; Serge Belongie; Peter Ebert Christensen

arXiv:2501.13851·cs.LG·January 24, 2025

TL;DR

This paper introduces a large meme dataset, an automated annotation pipeline using vision-language models, and a specialized CLIP model for meme-text retrieval to improve understanding and analysis of memes.

Contribution

It presents a new large-scale meme dataset, an automated annotation framework, and a fine-tuned CLIP model for meme-text retrieval, advancing meme analysis capabilities.

Findings

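01 Created the CM50 meme dataset, with over 33,000 memes centered around 50 popular meme templates

02 Developed an automated, knowledge-grounded annotation pipeline for memes using large vision-language models

03 Enhanced meme-text retrieval with a specialized CLIP model (mtrCLIP)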

Abstract

Memes have emerged as a powerful form of communication, integrating visual and textual elements to convey humor, satire, and cultural messages. Existing research has focused primarily on aspects such as emotion classification, meme generation, propagation, interpretation, figurative language, and sociolinguistics, but has often overlooked deeper meme comprehension and meme-text retrieval. To address these gaps, this study introduces ClassicMemes-50-templates (CM50), a large-scale dataset consisting of over 33,000 memes, centered around 50 popular meme templates. We also present an automated knowledge-grounded annotation pipeline leveraging large vision-language models to produce high-quality image captions, meme captions, and literary device labels, overcoming the labor-intensive demands of manual annotation. Additionally, we propose a meme-text retrieval CLIP model (mtrCLIP) that…
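To make the meme-text retrieval task concrete, the following is a minimal sketch of how retrieval with CLIP-style embeddings is commonly done: a text query and each meme are embedded in a shared space, and memes are ranked by cosine similarity to the query. This is not the paper's mtrCLIP implementation, whose details are not given in the truncated abstract. The function names (`l2_normalize`, `rank_memes`, `recall_at_k`) and the toy 3-dimensional vectors are hypothetical stand-ins for real encoder outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def rank_memes(text_embedding, meme_embeddings):
    """Return meme indices ordered from most to least similar to the text query."""
    query = l2_normalize(text_embedding)
    scored = []
    for index, meme in enumerate(meme_embeddings):
        unit = l2_normalize(meme)
        similarity = sum(q * m for q, m in zip(query, unit))
        scored.append((similarity, index))
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]


def recall_at_k(ranked_indices, target_index, k):
    """Return 1.0 if the correct meme appears in the top-k results, else 0.0."""
    return 1.0 if target_index in ranked_indices[:k] else 0.0


# Toy embeddings (assumption: real ones would come from an image/text encoder).
meme_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
]
text_embedding = [0.1, 0.9, 0.0]

ranking = rank_memes(text_embedding, meme_embeddings)
print(ranking)  # [1, 2, 0]
print(recall_at_k(ranking, target_index=2, k=2))  # 1.0
```

Averaging `recall_at_k` over many query-meme pairs gives the usual Recall@K retrieval metric; whether the paper reports this particular metric is not stated in the text above.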

Code & Models

Repositories

seefreem/meme_text_retrieval_p1
PyTorch · Official

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts

Methods: Contrastive Language-Image Pre-training