Let's Talk! Striking Up Conversations via Conversational Visual Question Generation
Shih-Han Chan, Tsai-Lun Yang, Yun-Wei Chu, Chi-Yang Hsu, Ting-Hao Huang, Yu-Shian Chiu, Lun-Wei Ku

TL;DR
This paper presents a two-phase framework for generating engaging questions from a set of user photos to initiate conversations, outperforming existing models in producing response-provoking questions.
Contribution
It introduces a novel two-step approach combining visual storytelling with question generation to enhance conversation starters from images.
Findings
Generated questions are more engaging than those produced by baseline vision-to-question models.
Human evaluation favors the proposed framework.
The method effectively initiates conversations based on visual content.
Abstract
An engaging and provocative question can open up a great conversation. In this work, we explore a novel scenario: a conversation agent views a set of the user's photos (for example, from social media platforms) and asks an engaging question to initiate a conversation with the user. The existing vision-to-question models mostly generate tedious and obvious questions, which might not be ideal conversation starters. This paper introduces a two-phase framework that first generates a visual story for the photo set and then uses the story to produce an interesting question. The human evaluation shows that our framework generates more response-provoking questions for starting conversations than other vision-to-question baselines.
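The two-phase structure described in the abstract can be sketched as a simple composition of two stages. This is only an illustrative sketch, not the authors' implementation: the class name `TwoPhasePipeline`, the toy generator functions, and the choice to represent each photo as a short text description are all assumptions made for the example. In a real system, each stage would be a trained model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical stage signatures (assumptions for illustration):
# phase 1 maps a photo set to a story, phase 2 maps a story to a question.
StoryGenerator = Callable[[Sequence[str]], str]
QuestionGenerator = Callable[[str], str]


@dataclass
class TwoPhasePipeline:
    """Chains a story generator and a question generator."""

    story_generator: StoryGenerator
    question_generator: QuestionGenerator

    def ask(self, photos: Sequence[str]) -> str:
        if not photos:
            raise ValueError("at least one photo is required")
        # Phase 1: turn the photo set into a visual story.
        story = self.story_generator(photos)
        # Phase 2: produce a question conditioned on the story,
        # rather than directly on the photos.
        return self.question_generator(story)


def toy_story_generator(photos: Sequence[str]) -> str:
    # Stand-in for a visual storytelling model; photos are given
    # here as plain text descriptions.
    return " ".join(f"We saw {p}." for p in photos)


def toy_question_generator(story: str) -> str:
    # Stand-in for a story-to-question model.
    return f"Which moment stood out most? (story: {story})"


pipeline = TwoPhasePipeline(toy_story_generator, toy_question_generator)
question = pipeline.ask(["a beach", "a sunset"])
print(question)
```

Keeping the two stages behind separate callables means either one can be swapped out independently, for example to compare a story-based question generator against a baseline that skips the story step.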
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Human Motion and Animation
