SentiCap: Generating Image Descriptions with Sentiments
Alexander Mathews, Lexing Xie, Xuming He

TL;DR
This paper introduces SentiCap, a neural network model that generates image captions with positive or negative sentiment, trained using only 2000+ sentences containing sentiments.
Contribution
It presents a novel switching recurrent neural network capable of generating sentiment-aware image captions with minimal sentiment-labeled training data.
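As a rough illustration of what "switching" can mean here, the sketch below mixes two next-word distributions, one from a factual stream and one from a sentiment stream, using a per-word switch weight. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the sigmoid switch, and the toy scores are all hypothetical, and the word-level regularization mentioned in the abstract is not modeled.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Squash a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def switched_word_distribution(factual_scores, sentiment_scores, switch_logit):
    """Mix two next-word distributions with a switch weight gamma.

    Assumption: gamma = sigmoid(switch_logit) is the weight on the
    sentiment stream and (1 - gamma) the weight on the factual stream.
    """
    gamma = sigmoid(switch_logit)
    p_fact = softmax(factual_scores)
    p_sent = softmax(sentiment_scores)
    return [(1.0 - gamma) * pf + gamma * ps for pf, ps in zip(p_fact, p_sent)]


if __name__ == "__main__":
    vocab = ["dog", "runs", "adorable"]   # toy vocabulary (hypothetical)
    factual = [1.0, 2.0, 0.0]             # toy scores from a factual stream
    sentiment = [0.0, 0.0, 3.0]           # toy scores from a sentiment stream
    for logit in (-5.0, 0.0, 5.0):
        probs = switched_word_distribution(factual, sentiment, logit)
        best = vocab[max(range(len(vocab)), key=lambda i: probs[i])]
        print(f"switch_logit={logit:+.1f} -> most likely word: {best}")
```

In this toy setup, a strongly negative switch logit makes the mixture follow the factual stream, while a strongly positive one makes it follow the sentiment stream.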
Findings
84.6% of positive captions were judged at least as descriptive as factual ones
88% of positive captions had correct sentiment according to crowd workers
The model compares favourably on common image captioning quality metrics
Abstract
The recent progress on image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from the current systems. One such style is descriptions with emotions, which is commonplace in everyday communication, and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
