Summary Grounded Conversation Generation
Chulaka Gunasekara, Guy Feigenblat, Benjamin Sznajder, Sachindra Joshi, David Konopnicki

TL;DR
This paper investigates using pre-trained language models to generate entire conversations from only a summary. It explores three generation approaches and shows that augmenting a conversation summarization dataset with the generated conversations improves summarization accuracy.
Contribution
It introduces three methods for generating grounded conversations from summaries and demonstrates how generated data can enhance summarization accuracy.
Findings
Generated conversations are comparable to human data in quality.
Augmenting datasets with generated conversations improves summarization performance.
Automatic and human evaluations validate the effectiveness of the approaches.
Abstract
Many conversation datasets have been constructed in recent years using crowdsourcing. However, the data collection process can be time-consuming and presents many challenges in ensuring data quality. Since language generation has improved immensely in recent years with the advancement of pre-trained language models, we investigate how such models can be utilized to generate entire conversations, given only a summary of a conversation as the input. We explore three approaches to generate summary grounded conversations, and evaluate the generated conversations using automatic measures and human judgements. We also show that the accuracy of conversation summarization can be improved by augmenting a conversation summarization dataset with generated conversations.
