A Knowledge-Grounded Multimodal Search-Based Conversational Agent

Shubham Agarwal; Ondrej Dusek; Ioannis Konstas; Verena Rieser

arXiv:1810.11954·cs.CL·November 22, 2018

TL;DR

This paper presents a neural response generation model for multimodal search-based dialogue that incorporates an external knowledge base (KB) and substantially outperforms strong baselines on text-based similarity measures.

Contribution

The paper introduces a knowledge-grounded multimodal conversational model in which an encoded knowledge base (KB) representation is appended to the decoder input during neural response generation.

Findings

01

The model outperforms strong baselines by over 9 BLEU points.

02

Information from the KB alone accounts for 3 BLEU points of the improvement over the baselines.

03

The gains are demonstrated on the Multimodal Dialogue (MMD) dataset (Saha et al., 2017).

Abstract

Multimodal search-based dialogue is a challenging new task: it extends visually grounded question answering systems into multi-turn conversations with access to an external database. We address this new challenge by learning a neural response generation system from the recently released Multimodal Dialogue (MMD) dataset (Saha et al., 2017). We introduce a knowledge-grounded multimodal conversational model where an encoded knowledge base (KB) representation is appended to the decoder input. Our model substantially outperforms strong baselines in terms of text-based similarity measures (over 9 BLEU points, 3 of which are solely due to the use of additional information from the KB).
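
The idea of appending an encoded KB representation to the decoder input can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names `encode_kb` and `append_kb_to_decoder_inputs` are hypothetical, mean pooling stands in for whatever learned KB encoder the paper uses, and plain Python lists stand in for tensors.

```python
from typing import List

Vector = List[float]


def encode_kb(kb_entry_embeddings: List[Vector]) -> Vector:
    """Collapse KB entry embeddings into one fixed-size vector.

    Assumption: mean pooling is used here only for illustration; the
    paper's KB encoder is a learned component and may differ.
    """
    if not kb_entry_embeddings:
        raise ValueError("need at least one KB entry embedding")
    dim = len(kb_entry_embeddings[0])
    count = len(kb_entry_embeddings)
    return [sum(entry[i] for entry in kb_entry_embeddings) / count
            for i in range(dim)]


def append_kb_to_decoder_inputs(token_embeddings: List[Vector],
                                kb_vector: Vector) -> List[Vector]:
    """Append the same KB vector to the decoder input at every time step."""
    return [token + kb_vector for token in token_embeddings]


# Toy example: two KB entries of size 2, two decoder steps of size 3.
kb_vector = encode_kb([[1.0, 3.0], [3.0, 5.0]])
decoder_inputs = append_kb_to_decoder_inputs(
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], kb_vector
)
print(len(decoder_inputs[0]))  # 5: token size 3 plus KB size 2
```

Under these assumptions, every decoder step sees the same KB summary alongside the current token embedding, so the decoder's input size grows by the size of the KB vector.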

Code & Models

Repositories

shubhamagarwal92/mmd
PyTorch (official)
