A Multimodal Memes Classification: A Survey and Open Research Issues
Tariq Habib Afridi, Aftab Alam, Muhammad Numan Khan, Jawad Khan, Young-Koo Lee

TL;DR
This paper surveys multimodal memes classification, highlighting challenges, reviewing state-of-the-art solutions, and proposing a generalized framework, aiming to guide future research in automatic meme censorship and understanding.
Contribution
It provides the first comprehensive survey and a generalized framework for multimodal memes classification, identifying open issues and guiding future research directions.
Findings
State-of-the-art VL methods often fail on memes classification
Identified key challenges in multimodal memes understanding
Proposed a generalized framework for VL problems
Abstract
Memes are graphics and text overlapped so that together they present concepts that become dubious if one of them is absent. They spread mostly on social media platforms in the form of jokes, sarcasm, motivational messages, etc. After the success of BERT in Natural Language Processing (NLP), researchers turned to Visual-Linguistic (VL) multimodal problems such as memes classification, image captioning, Visual Question Answering (VQA), and many more. Unfortunately, many memes are uploaded to social media platforms each day, and these need automatic censoring to curb misinformation and hate. Recently, this issue has attracted the attention of researchers and practitioners. State-of-the-art methods that perform well on other VL datasets tend to fail on memes classification. In this context, this work aims to conduct a comprehensive study on memes classification, generally on the VL multimodal…
Taxonomy
Methods: Linear Layer · Softmax · Layer Normalization · Weight Decay · Dropout · Linear Warmup With Linear Decay · Dense Connections · Attention Dropout · WordPiece · Multi-Head Attention
