Let's Talk! Striking Up Conversations via Conversational Visual Question Generation

Shih-Han Chan; Tsai-Lun Yang; Yun-Wei Chu; Chi-Yang Hsu; Ting-Hao Huang; Yu-Shian Chiu; Lun-Wei Ku

arXiv:2205.09327·cs.AI·May 20, 2022

TL;DR

This paper presents a two-phase framework for generating engaging questions from a set of user photos to initiate conversations, outperforming existing models in producing response-provoking questions.

Contribution

The paper introduces a two-step approach that combines visual storytelling with question generation to produce better conversation starters from images.

Findings

01 Generated questions are more engaging than those produced by baseline models.

02 Human evaluation favors the proposed framework.

03 The method effectively initiates conversations based on visual content.

Abstract

An engaging and provocative question can open up a great conversation. In this work, we explore a novel scenario: a conversation agent views a set of the user's photos (for example, from social media platforms) and asks an engaging question to initiate a conversation with the user. Existing vision-to-question models mostly generate tedious and obvious questions, which might not be ideal conversation starters. This paper introduces a two-phase framework that first generates a visual story for the photo set and then uses the story to produce an interesting question. Human evaluation shows that our framework generates more response-provoking questions for starting conversations than other vision-to-question baselines.
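
The two-phase flow described in the abstract (photo set to story, then story to question) can be pictured as a simple composition of two stages. The sketch below is only an illustration of that control flow, not the authors' implementation: the names `ask_about_photos`, `toy_story`, and `toy_question` are hypothetical, and the toy stand-ins replace the trained visual storytelling and question generation models, which the abstract does not specify.

```python
from typing import Callable, Sequence

# Hypothetical type aliases; neither name comes from the paper.
StoryGenerator = Callable[[Sequence[str]], str]  # photo identifiers -> story text
QuestionGenerator = Callable[[str], str]         # story text -> question text


def ask_about_photos(
    photos: Sequence[str],
    tell_story: StoryGenerator,
    ask_question: QuestionGenerator,
) -> str:
    """Run the two phases in order: photo set -> story -> question."""
    if not photos:
        raise ValueError("need at least one photo")
    story = tell_story(photos)   # phase 1: visual storytelling
    return ask_question(story)   # phase 2: question generation from the story


# Toy stand-ins (assumptions) so the sketch runs without any trained model.
def toy_story(photos: Sequence[str]) -> str:
    return f"A story spanning {len(photos)} photos."


def toy_question(story: str) -> str:
    return f"What happened next? ({story})"


if __name__ == "__main__":
    print(ask_about_photos(["beach.jpg", "dinner.jpg"], toy_story, toy_question))
```

Passing the two stages in as callables keeps the sketch independent of any particular model; in practice each stage would wrap whatever storytelling and question generation models are available.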

Taxonomy

Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Human Motion and Animation