Does Collaborative Human-LM Dialogue Generation Help Information Extraction from Human Dialogues?
Bo-Ru Lu, Nikita Haduong, Chia-Hsuan Lee, Zeqiu Wu, Hao Cheng, Paul Koester, Jean Utke, Tao Yu, Noah A. Smith, Mari Ostendorf

TL;DR
This paper presents a human-in-the-loop dialogue generation framework that synthesizes realistic conversations to improve information extraction on private call center data, yielding a 25% relative F1 improvement when the synthetic dialogues augment a small set of real conversations.
Contribution
It introduces a novel dialogue synthesis method for privacy-sensitive domains and demonstrates its effectiveness in enhancing information extraction from real-world call center dialogues.
Findings
25% relative F1 improvement with synthetic data augmentation
Synthetic dialogues capture complex real-world call center interactions
Framework supports privacy-preserving data augmentation
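For readers unfamiliar with the term, a relative improvement is measured as a fraction of the baseline score rather than as an absolute difference. The sketch below shows that arithmetic. The function name and the two F1 values are hypothetical, chosen only to illustrate the calculation, and are not results reported in the paper.

```python
def relative_improvement(baseline: float, augmented: float) -> float:
    """Return the gain of `augmented` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (augmented - baseline) / baseline


# Hypothetical F1 scores, for illustration only (not taken from the paper).
baseline_f1 = 0.40   # assumed: model trained on real dialogues only
augmented_f1 = 0.50  # assumed: model trained on real + synthetic dialogues

gain = relative_improvement(baseline_f1, augmented_f1)
print(f"absolute gain: {augmented_f1 - baseline_f1:.2f}, relative gain: {gain:.0%}")
```

With these illustrative numbers, an absolute gain of 0.10 F1 corresponds to a 25% relative gain, which is why a relative figure alone does not reveal the underlying scores.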
Abstract
The capabilities of pretrained language models have opened opportunities to explore new application areas, but applications involving human-human interaction are limited by the fact that most data is protected from public release for privacy reasons. Problem-solving human dialogues in real applications can be much more complex than existing Wizard-of-Oz collections, preventing successful domain transfer. To support information extraction (IE) for a private call center dataset, we introduce a human-in-the-loop dialogue generation framework capable of synthesizing realistic dialogues. In IE experiments with auto insurance call center dialogues, we observe a 25% relative improvement in F1 after augmenting a small set of real human conversations with synthetic data. We release code and our synthetic dataset to illustrate the complexity of real-world call center conversations and encourage…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
