Summary Grounded Conversation Generation

Chulaka Gunasekara; Guy Feigenblat; Benjamin Sznajder; Sachindra Joshi; David Konopnicki

arXiv:2106.03337·cs.CL·June 8, 2021

TL;DR

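This paper investigates using pre-trained language models to generate entire conversations given only a summary as input. It explores three generation approaches, evaluates them with automatic measures and human judgements, and shows that augmenting a conversation summarization dataset with the generated conversations improves summarization accuracy.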

Contribution

It introduces three methods for generating grounded conversations from summaries and demonstrates how generated data can enhance summarization accuracy.

Findings

01

Generated conversations are comparable to human data in quality.

02

Augmenting datasets with generated conversations improves summarization performance.

03

Automatic and human evaluations validate the effectiveness of the approaches.

Abstract

Many conversation datasets have been constructed in recent years using crowdsourcing. However, the data collection process can be time-consuming and presents many challenges in ensuring data quality. Since language generation has improved immensely in recent years with the advancement of pre-trained language models, we investigate how such models can be utilized to generate entire conversations, given only a summary of a conversation as the input. We explore three approaches to generate summary grounded conversations, and evaluate the generated conversations using automatic measures and human judgements. We also show that the accuracy of conversation summarization can be improved by augmenting a conversation summarization dataset with generated conversations.
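
The augmentation step described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the `Example` type, the `augment` function, and the `toy_generate` stand-in are all hypothetical names introduced here, and pairing each generated conversation with the summary it was generated from is an assumption about how the augmented examples are formed. In practice, `toy_generate` would be replaced by one of the paper's summary grounded generation models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Example:
    """One (conversation, summary) pair in a summarization dataset (hypothetical type)."""
    conversation: List[str]
    summary: str


def augment(
    dataset: List[Example],
    generate: Callable[[str], List[str]],
    per_summary: int = 1,
) -> List[Example]:
    """Return the original examples plus generated ones.

    Assumption: each generated conversation is paired with the same
    summary that was used as the generator's input.
    """
    augmented = list(dataset)
    for example in dataset:
        for _ in range(per_summary):
            conversation = generate(example.summary)
            if conversation:  # skip empty generations
                augmented.append(Example(conversation, example.summary))
    return augmented


def toy_generate(summary: str) -> List[str]:
    """Placeholder generator, NOT a real model: it only restates the summary."""
    return [f"A: {summary}", "B: Sounds good."]


original = [Example(["A: Lunch at noon?", "B: Sure."], "A and B agree to have lunch at noon.")]
augmented = augment(original, toy_generate)
```

A summarization model would then be trained on `augmented` instead of `original`. Keeping the generator as a plain callable lets different generation approaches be swapped in without changing the augmentation loop.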
