DFlow: Diverse Dialogue Flow Simulation with Large Language Models
Wanyu Du, Song Feng, James Gung, Lijia Sun, Yi Zhang, Saab Mansour, Yanjun Qi

TL;DR
This paper introduces DFlow, a data simulation approach that uses large language models to generate decision tree-structured task plans. Dialogue flows derived from these plans increase the task logic diversity of synthetic dialogue datasets used to train dialogue agents.
Contribution
The paper presents a new method for generating diverse dialogue flows based on task execution logic, addressing a gap in existing data simulation techniques.
Findings
Generated 3,886 dialogue flows across 15 domains
Models trained on this data outperform baselines including GPT-4
Enhanced task logic diversity improves next action prediction accuracy
Abstract
Developing language model-based dialogue agents requires effective data to train models that can follow specific task logic. However, most existing data simulation methods focus on increasing diversity in language, topics, or dialogue acts at the utterance level, largely neglecting a critical aspect of task logic diversity at the dialogue level. This paper proposes a novel data simulation method designed to enhance the diversity of synthetic dialogues by focusing on task execution logic. Our method uses LLMs to generate decision tree-structured task plans, which enables the derivation of diverse dialogue trajectories for a given task. Each trajectory, referred to as a "dialog flow", guides the generation of a multi-turn dialogue that follows a unique trajectory. We apply this method to generate a task-oriented dialogue dataset comprising 3,886 dialogue flows across 15 different domains.…
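The step from a decision tree-structured task plan to a set of dialogue flows can be illustrated with a small sketch. This is not the authors' implementation: the `PlanNode` class, the `enumerate_flows` function, and the toy flight-booking plan are all hypothetical, and the sketch assumes that each root-to-leaf path of the tree corresponds to one dialogue flow. In the paper the plan is generated by an LLM; here it is written by hand.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """One step of a task plan (hypothetical structure, for illustration only)."""
    action: str
    # Maps a branching condition to the next step taken when it holds.
    children: dict[str, PlanNode] = field(default_factory=dict)


def enumerate_flows(node: PlanNode, prefix: tuple[str, ...] = ()) -> list[list[str]]:
    """Return every root-to-leaf path as an alternating list of actions and conditions."""
    path = prefix + (node.action,)
    if not node.children:
        # A leaf ends the task, so the accumulated path is one complete flow.
        return [list(path)]
    flows: list[list[str]] = []
    for condition, child in node.children.items():
        flows.extend(enumerate_flows(child, path + (f"[{condition}]",)))
    return flows


# Toy plan, written by hand as an assumption; an LLM would produce this in practice.
plan = PlanNode(
    "ask_destination",
    {
        "destination given": PlanNode(
            "check_availability",
            {
                "seats available": PlanNode("confirm_booking"),
                "sold out": PlanNode("offer_alternative_date"),
            },
        ),
        "user unsure": PlanNode("suggest_destinations"),
    },
)

flows = enumerate_flows(plan)
for flow in flows:
    print(" -> ".join(flow))
```

Each printed path could then serve as the skeleton for one multi-turn dialogue, so a single plan with several branches yields several dialogues with different task logic.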
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Multi-Agent Systems and Negotiation
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Position-Wise Feed-Forward Layer · Dense Connections · Softmax · Multi-Head Attention · Adam · Dropout
