Customized Image Narrative Generation via Interactive Visual Question Generation and Answering

Andrew Shin; Yoshitaka Ushiku; Tatsuya Harada

arXiv:1805.00460·cs.CL·May 2, 2018

TL;DR

This paper introduces a novel interactive method for generating customized image narratives by engaging users with visual questions and answers, capturing diverse perspectives and interests.

Contribution

It proposes a new interactive framework for image description that learns user interests over multiple stages, enabling personalized and diverse image narratives.

Findings

01 Generated descriptions cover a wider range of topics.

02 The model adapts to individual user interests.

03 The model produces more diverse narratives than conventional image description techniques.

Abstract

The image description task has invariably been examined in a static manner, with qualitative presumptions held to be universally applicable regardless of the scope or target of the description. In practice, however, different viewers may pay attention to different aspects of the image and yield different descriptions or interpretations under various contexts. Such diversity in perspectives is difficult to derive with conventional image description techniques. In this paper, we propose a customized image narrative generation task, in which users are interactively engaged in the generation process by providing answers to questions. We further attempt to learn the user's interest by repeating such interactive stages, and to automatically reflect that interest in descriptions for new images. Experimental results demonstrate that our model can generate a variety of descriptions from…

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization