CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation
Yinpei Dai, Wanwei He, Bowen Li, Yuchuan Wu, Zheng Cao, Zhongqi An, Jian Sun, Yongbin Li

TL;DR
CGoDial is a comprehensive large-scale Chinese benchmark for goal-oriented dialog evaluation, covering multiple knowledge sources and realistic spoken scenarios to advance dialog system research.
Contribution
It introduces a new challenging Chinese benchmark with diverse datasets, real conversation data, and varied experimental settings for robust dialog system evaluation.
Findings
Benchmark covers 96,763 sessions and 574,949 turns across three knowledge types.
Includes real conversation data and spoken features to simulate real-world scenarios.
Experimental settings evaluate generalization, adaptability, and robustness of dialog models.
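As a quick sanity check on the scale quoted above, the average dialog length can be derived from the two reported totals. The snippet below is only an illustrative calculation; the function name is a hypothetical helper, not part of any CGoDial tooling.

```python
# Totals as quoted in the summary above.
SESSIONS = 96_763
TURNS = 574_949


def average_turns_per_session(turns: int, sessions: int) -> float:
    """Return the mean number of dialog turns per session (hypothetical helper)."""
    return turns / sessions


avg = average_turns_per_session(TURNS, SESSIONS)
print(f"{avg:.2f} turns per session on average")
```

This is an average over the whole benchmark; the per-dataset figures for SBD, FBD, and RBD are not given in this summary and may differ.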
Abstract
Practical dialog systems need to deal with various knowledge sources, noisy user expressions, and the shortage of annotated data. To better address these problems, we propose CGoDial, a new challenging and comprehensive Chinese benchmark for multi-domain Goal-oriented Dialog evaluation. It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources: 1) a slot-based dialog (SBD) dataset with table-formed knowledge, 2) a flow-based dialog (FBD) dataset with tree-formed knowledge, and 3) a retrieval-based dialog (RBD) dataset with candidate-formed knowledge. To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing. The proposed experimental settings include the combinations of training with either the…
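To make the three knowledge forms named in the abstract more concrete, here is a minimal sketch of how table-formed, tree-formed, and candidate-formed knowledge might be represented. Every class name, field, and toy value below is an assumption made for illustration; none of it reflects the actual CGoDial data schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TableKnowledge:
    """Table-formed knowledge (SBD): rows of slot-value pairs (assumed layout)."""
    rows: list[dict[str, str]]

    def lookup(self, **constraints: str) -> list[dict[str, str]]:
        # Return the rows that satisfy every slot constraint given by the user.
        return [
            row for row in self.rows
            if all(row.get(slot) == value for slot, value in constraints.items())
        ]


@dataclass
class FlowNode:
    """Tree-formed knowledge (FBD): a node whose children are keyed by user intent."""
    prompt: str
    children: dict[str, FlowNode] = field(default_factory=dict)


@dataclass
class CandidateKnowledge:
    """Candidate-formed knowledge (RBD): a flat pool of candidate responses."""
    candidates: list[str]


def follow(node: FlowNode, intents: list[str]) -> FlowNode:
    """Walk the dialog tree along a sequence of recognized intents."""
    for intent in intents:
        node = node.children[intent]
    return node


# Toy, made-up data purely for illustration.
table = TableKnowledge(rows=[
    {"name": "A", "area": "north"},
    {"name": "B", "area": "south"},
])
root = FlowNode("start", {"yes": FlowNode("confirm")})
pool = CandidateKnowledge(candidates=["answer one", "answer two"])

print(table.lookup(area="north"))
print(follow(root, ["yes"]).prompt)
print(len(pool.candidates))
```

The point of the sketch is only that the three datasets ask a model to ground its responses in differently shaped knowledge: filtering rows, traversing a tree, or selecting from a candidate pool.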
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Test
