TL;DR
This paper fine-tunes BERT-based models with transfer learning and ensembles them across modalities to detect persuasion techniques in memes, reaching F1-scores of 57.0, 48.2, and 52.1 on three subtasks that cover text and images.
Contribution
It proposes a transfer learning approach for fine-tuning BERT-based models in different modalities, and explores ensembles of those models for detecting persuasion techniques in memes.
Findings
Achieved F1-scores of 57.0, 48.2, and 52.1 on subtasks A, B, and C, respectively.
Demonstrated the effectiveness of multimodal ensembles over single-modality models.
Validated the approach on the SemEval-2021 Task 6 dataset.
Abstract
Memes are one of the most popular types of content used to spread information online. They can influence a large number of people through rhetorical and psychological techniques. The task, Detection of Persuasion Techniques in Texts and Images, is to detect these persuasive techniques in memes. It consists of three subtasks: (A) multi-label classification using textual content, (B) multi-label classification and span identification using textual content, and (C) multi-label classification using visual and textual content. In this paper, we propose a transfer learning approach to fine-tune BERT-based models in different modalities. We also explore the effectiveness of ensembles of models trained in different modalities. We achieve F1-scores of 57.0, 48.2, and 52.1 on subtasks A, B, and C, respectively.
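The abstract does not say how the ensembles combine their members, so the following is only a minimal sketch of one common choice: average the per-label probabilities produced by a text model and an image model, then threshold each label independently to get a multi-label prediction. The function names, the 0.5 threshold, the toy probabilities, and the use of micro-averaged F1 as the score are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def ensemble_multilabel(
    model_probs: Sequence[Sequence[Sequence[float]]],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Average per-label probabilities across models, then threshold.

    model_probs[m][i][j] is the probability that model m assigns to
    label j for sample i. The 0.5 threshold is an illustrative default.
    """
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_labels = len(model_probs[0][0])
    predictions = []
    for i in range(n_samples):
        row = []
        for j in range(n_labels):
            mean_p = sum(m[i][j] for m in model_probs) / n_models
            row.append(1 if mean_p >= threshold else 0)
        predictions.append(row)
    return predictions


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all samples and labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy, made-up probabilities for 2 memes and 3 hypothetical technique labels.
text_probs = [[0.9, 0.2, 0.6], [0.1, 0.7, 0.4]]
image_probs = [[0.7, 0.4, 0.2], [0.3, 0.9, 0.8]]
gold = [[1, 0, 0], [0, 1, 1]]

preds = ensemble_multilabel([text_probs, image_probs])
score = micro_f1(gold, preds)
```

In this toy data each single model, thresholded on its own, gets one label wrong on one of the memes, while the averaged prediction matches the gold labels. That is the kind of complementarity an ensemble across modalities is meant to exploit, though real gains depend on the models and data.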
