Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation
Cheng Niu, Xingguang Wang, Xuxin Cheng, Juntong Song, and Tong Zhang

TL;DR
This paper demonstrates that using GPT-4 to generate synthetic dialogue data for training improves dialogue state tracking models, reducing annotation costs and enabling quick adaptation to new domains.
Contribution
The study uses GPT-4 to simulate user-agent interactions and generate DST-annotated training dialogues, then fine-tunes LLaMA 2 in two stages on the generated and real data, improving performance and adaptability with minimal real data.
Findings
On two public DST benchmarks, adding generated data improves DST model performance over a baseline trained solely on real data.
Models trained on synthetic data adapt quickly to new domains.
Synthetic dialogues enable cost-effective and scalable DST training.
Abstract
Dialogue State Tracking (DST) is designed to monitor the evolving dialogue state in the conversations and plays a pivotal role in developing task-oriented dialogue systems. However, obtaining the annotated data for the DST task is usually a costly endeavor. In this paper, we focus on employing LLMs to generate dialogue data to reduce dialogue collection and annotation costs. Specifically, GPT-4 is used to simulate the user and agent interaction, generating thousands of dialogues annotated with DST labels. Then a two-stage fine-tuning on LLaMA 2 is performed on the generated data and the real data for the DST prediction. Experimental results on two public DST benchmarks show that with the generated dialogue data, our model performs better than the baseline trained solely on real data. In addition, our approach is also capable of adapting to the dynamic demands in real-world scenarios,…
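The user-agent simulation described in the abstract can be pictured with a minimal sketch. It is not the paper's pipeline: the prompts and GPT-4 calls are not given here, so the two turn functions below (`scripted_user`, `scripted_agent`) are hypothetical scripted stand-ins for LLM calls, and the slot names are illustrative. The sketch only shows the general idea of alternating simulated turns while recording a cumulative slot-value state as the DST label for each user turn.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]  # slot name -> value, e.g. {"hotel-area": "north"}


def simulate_dialogue(
    user_turn: Callable[[State, List[dict]], Tuple[str, State]],
    agent_turn: Callable[[List[dict]], str],
    goal: State,
    max_turns: int = 10,
) -> List[dict]:
    """Alternate user and agent turns; label each user turn with the cumulative state."""
    history: List[dict] = []
    state: State = {}
    for _ in range(max_turns):
        utterance, new_slots = user_turn(goal, history)
        state = {**state, **new_slots}  # dialogue state accumulates across turns
        history.append({"speaker": "user", "text": utterance, "state": dict(state)})
        history.append({"speaker": "agent", "text": agent_turn(history)})
        if state == goal:  # stop once every goal slot has been expressed
            break
    return history


def scripted_user(goal: State, history: List[dict]) -> Tuple[str, State]:
    # Hypothetical stand-in for an LLM call: reveal the next unmentioned goal slot.
    known = history[-2]["state"] if len(history) >= 2 else {}
    for slot, value in goal.items():
        if slot not in known:
            return f"I need {slot} to be {value}.", {slot: value}
    return "That is all, thanks.", {}


def scripted_agent(history: List[dict]) -> str:
    # Hypothetical stand-in for an LLM call: a fixed acknowledgement.
    return "Noted. Anything else?"


goal = {"hotel-area": "north", "hotel-stars": "4"}
dialogue = simulate_dialogue(scripted_user, scripted_agent, goal)
for turn in dialogue:
    print(turn)
```

In a real setup, each stand-in would be replaced by a prompted model call, and the recorded `state` entries would serve as the annotation for fine-tuning a DST model.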
Taxonomy
Topics: Speech and dialogue systems · Multi-Agent Systems and Negotiation · Service-Oriented Architecture and Web Services
Methods: Attention Is All You Need · Dynamic Sparse Training · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout
