Equal But Not The Same: Understanding the Implicit Relationship Between Persuasive Images and Text
Mingda Zhang, Rebecca Hwa, Adriana Kovashka

TL;DR
This paper investigates the complex, non-literal relationships between images and text in advertisements, introducing a new dataset and methods to better understand their implicit interactions beyond literal content matching.
Contribution
It presents a novel dataset of advertisement image-text pairs with annotated relationships and develops methods that outperform standard approaches in identifying parallel versus non-parallel relationships.
Findings
The proposed methods outperform standard image-text alignment approaches at identifying parallel versus non-parallel relationships.
The dataset captures complex, non-literal relationships in advertisements.
Features that capture the creativity of images and the specificity or ambiguity of text improve relationship prediction.
Abstract
Images and text in advertisements interact in complex, non-literal ways. The two channels are usually complementary, with each channel telling a different part of the story. Current approaches, such as image captioning methods, only examine literal, redundant relationships, where image and text show exactly the same content. To understand more complex relationships, we first collect a dataset of advertisement interpretations for whether the image and slogan in the same visual advertisement form a parallel (conveying the same message without literally saying the same thing) or non-parallel relationship, with the help of workers recruited on Amazon Mechanical Turk. We develop a variety of features that capture the creativity of images and the specificity or ambiguity of text, as well as methods that analyze the semantics within and across channels. We show that our method outperforms…
Taxonomy
Topics: Multimodal Machine Learning Applications · Language, Metaphor, and Cognition · Video Analysis and Summarization
