Similar Scenes arouse Similar Emotions: Parallel Data Augmentation for Stylized Image Captioning
Guodun Li, Yuchen Zhai, Zehao Lin, Yin Zhang

TL;DR
This paper introduces a novel data augmentation framework for stylized image captioning that leverages similar scenes and emotion extraction to improve caption quality and stylistic consistency, addressing data scarcity issues.
Contribution
It proposes an Extract-Retrieve-Generate framework that enhances stylized captioning by grafting style phrases from similar scenes, significantly improving model performance.
Findings
Boosts caption relevance and style consistency
Alleviates data scarcity in stylized captioning
Outperforms state-of-the-art methods
Abstract
Stylized image captioning systems aim to generate a caption that is not only semantically related to a given image but also consistent with a given style description. One of the biggest challenges of this task is the lack of sufficient paired stylized data. Many studies focus on unsupervised approaches without considering the problem from the perspective of data augmentation. We begin with the observation that people may recall similar emotions when they are in similar scenes, and often express similar emotions with similar style phrases, which underpins our data augmentation idea. In this paper, we propose a novel Extract-Retrieve-Generate data augmentation framework to extract style phrases from small-scale stylized sentences and graft them to large-scale factual captions. First, we design the emotional signal extractor to extract style phrases from small-scale stylized sentences. Second, we construct…
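The grafting idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is truncated before the retrieval and generation steps are described, so everything below is an assumption made for illustration. The lexicon `STYLE_WORDS`, the Jaccard-based scene similarity, the `min_sim` threshold, and the helper names `extract_style_phrase` and `augment` are all hypothetical stand-ins for the learned components the paper proposes.

```python
import re

# Hypothetical toy lexicon; stands in for a learned emotional signal extractor.
STYLE_WORDS = {"lovely", "beautiful", "happy", "lonely", "scary", "gloomy"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_style_phrase(sentence):
    """Extract step (toy): find a style word and the word it modifies.

    Returns (style_word, head_word, content_tokens), or None if the
    sentence contains no style word followed by another word.
    """
    tokens = tokenize(sentence)
    for i, tok in enumerate(tokens[:-1]):
        if tok in STYLE_WORDS:
            content = set(tokens) - {tok}
            return tok, tokens[i + 1], content
    return None


def jaccard(a, b):
    """Token-set overlap, used here as a crude proxy for scene similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def augment(stylized_sentences, factual_captions, min_sim=0.2):
    """Retrieve the most similar stylized sentence for each factual caption
    and graft its style word in front of the shared head word.

    Returns a list of (factual_caption, pseudo_stylized_caption) pairs.
    """
    extracted = [e for e in map(extract_style_phrase, stylized_sentences) if e]
    pairs = []
    for caption in factual_captions:
        cap_tokens = tokenize(caption)
        best, best_sim = None, min_sim
        for style_word, head_word, content in extracted:
            if head_word not in cap_tokens:
                continue  # the phrase has nothing to attach to in this caption
            sim = jaccard(content, set(cap_tokens))
            if sim >= best_sim:
                best, best_sim = (style_word, head_word), sim
        if best is not None:
            style_word, head_word = best
            idx = cap_tokens.index(head_word)
            grafted = cap_tokens[:idx] + [style_word] + cap_tokens[idx:]
            pairs.append((caption, " ".join(grafted)))
    return pairs


pairs = augment(
    ["A lovely dog runs on the grass."],
    ["A dog plays on the grass with a ball.", "A man rides a bike."],
)
```

In this sketch only the first factual caption receives a grafted phrase, because the second shares no head word with the stylized sentence. A real system would presumably replace the lexicon lookup, the token-overlap retrieval, and the naive insertion with trained models.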
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques
