Variational Learning for Unsupervised Knowledge Grounded Dialogs

Mayank Mishra; Dhiraj Madan; Gaurav Pandey; Danish Contractor

arXiv:2112.00653·cs.CL·August 16, 2022

TL;DR

This paper introduces a variational learning approach for unsupervised knowledge grounded dialog systems. It improves response generation by better modeling the grounding documents as latent variables, without requiring the exact document to be known during training.

Contribution

It develops a variational training method that maximizes the evidence lower bound (ELBO) for knowledge grounded dialogs, enabling more effective training on large, unstructured knowledge collections.
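
For orientation, the ELBO mentioned above can be written in its standard textbook form. The notation here is an assumption for illustration and is not taken from the paper: x is the dialog context, y the response, z the latent document, p_θ(z | x) the retrieval prior, and q_φ(z | x, y) a variational posterior.

```latex
\log p_\theta(y \mid x)
  = \log \sum_{z} p_\theta(z \mid x)\, p_\theta(y \mid x, z)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[ \log p_\theta(y \mid x, z) \right]
  - \mathrm{KL}\!\left( q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x) \right)
```

The left-hand side is the marginal log likelihood of the kind that RAG-style models optimize; the right-hand side is the lower bound that a variational method maximizes instead.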

Findings

01

Posterior distribution improves training accuracy.

02

Efficient approximation of ELBO over large knowledge bases.

03

First application of variational training in open-scale knowledge grounded dialogs.

Abstract

Recent methods for knowledge grounded dialogs generate responses by incorporating information from an external textual document. These methods do not require the exact document to be known during training; instead, they rely on a retrieval system to fetch relevant documents from a large index. The documents used to generate the responses are modeled as latent variables whose prior probabilities need to be estimated. Models such as RAG and REALM marginalize the document probabilities over the documents retrieved from the index to define the log-likelihood loss function, which is optimized end-to-end. In this paper, we develop a variational approach to the above technique in which we instead maximize the evidence lower bound (ELBO). Using a collection of three publicly available open-conversation datasets, we demonstrate how the posterior distribution, which has information from the…
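
The difference between the marginal log likelihood and the ELBO described in the abstract can be illustrated with a small numeric sketch. Everything below is hypothetical: the function names, the three toy documents, and the probabilities are made up for illustration and do not come from the paper or its repository. The sketch only shows the general relationship between the two objectives for a discrete latent document.

```python
import math


def log_marginal(prior, likelihood):
    """Marginal log likelihood: log sum_z p(z|x) * p(y|x,z)."""
    return math.log(sum(p * l for p, l in zip(prior, likelihood)))


def elbo(posterior, prior, likelihood):
    """ELBO: E_q[log p(y|x,z)] - KL(q || prior), for a discrete latent z."""
    total = 0.0
    for q, p, l in zip(posterior, prior, likelihood):
        if q > 0.0:  # terms with q = 0 contribute nothing
            total += q * (math.log(l) + math.log(p) - math.log(q))
    return total


def exact_posterior(prior, likelihood):
    """Bayes posterior p(z|x,y), proportional to p(z|x) * p(y|x,z)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    norm = sum(joint)
    return [j / norm for j in joint]


# Toy values (assumed): three retrieved documents.
prior = [0.5, 0.3, 0.2]          # p(z|x) from a hypothetical retriever
likelihood = [0.01, 0.20, 0.05]  # p(y|x,z) from a hypothetical generator
approx_q = [0.2, 0.6, 0.2]       # an arbitrary approximate posterior

marginal = log_marginal(prior, likelihood)
bound_approx = elbo(approx_q, prior, likelihood)
bound_exact = elbo(exact_posterior(prior, likelihood), prior, likelihood)

print(f"log marginal likelihood: {marginal:.4f}")
print(f"ELBO (approximate q):    {bound_approx:.4f}")
print(f"ELBO (exact posterior):  {bound_exact:.4f}")
```

The ELBO never exceeds the marginal log likelihood, and the gap closes when the variational posterior equals the exact posterior, which is why a posterior that also sees the response can give a useful training signal.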

Code & Models

Repositories

mayank31398/VRAG (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Dropout · BART · Linear Warmup With Linear Decay · WordPiece · Layer Normalization