Game-Based Video-Context Dialogue

Ramakanth Pasunuru; Mohit Bansal

arXiv:1809.04560 · cs.CL · October 18, 2018

TL;DR

This paper introduces a new multimodal dialogue dataset based on live soccer videos and chats, enabling the development of models that generate contextually relevant dialogue grounded in dynamic visual and conversational data.

Contribution

It presents a novel video-context, multi-speaker dialogue dataset and baseline models for visually-grounded dialogue in dynamic, real-world scenarios.

Findings

01. Models can generate relevant dialogue from video and chat context.

02. The dataset enables evaluation of multimodal dialogue systems.

03. Baseline models show promising results with room for improvement.

Abstract

Current dialogue systems focus more on textual and speech context knowledge and are usually based on two speakers. Some recent work has investigated static image-based dialogue. However, several real-world human interactions also involve dynamic visual context (similar to videos) as well as dialogue exchanges among multiple speakers. To move closer towards such multimodal conversational skills and visually-situated applications, we introduce a new video-context, many-speaker dialogue dataset based on live-broadcast soccer game videos and chats from Twitch.tv. This challenging testbed allows us to develop visually-grounded dialogue models that should generate relevant temporal and spatial event language from the live video, while also being relevant to the chat history. For strong baselines, we also present several discriminative and generative models, e.g., based on tridirectional…

Code & Models

Repositories

ramakanth-pasunuru/video-dialogue
TensorFlow · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Speech and dialogue systems