Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs
Reham Omar, Omij Mangukiya, Essam Mansour

TL;DR
This paper presents Chatty-Gen, a cost-effective, multi-stage retrieval-augmented system that automatically generates high-quality, domain-specific dialogue benchmarks from knowledge graphs, improving over existing methods in efficiency and consistency.
Contribution
Introduces Chatty-Gen, a novel multi-stage retrieval-augmented framework for automated dialogue benchmark generation from knowledge graphs, reducing reliance on costly LLMs and ensuring quality control.
Findings
Outperforms state-of-the-art systems in benchmark quality
Works effectively across diverse large knowledge graphs
Maintains performance across LLMs such as GPT-4o, Gemini 1.5, Llama 3, and Mistral
Abstract
Dialogue benchmarks are crucial in training and evaluating chatbots engaging in domain-specific conversations. Knowledge graphs (KGs) represent semantically rich and well-organized data spanning various domains, such as DBLP, DBpedia, and YAGO. Traditionally, dialogue benchmarks have been manually created from documents, neglecting the potential of KGs in automating this process. Some question-answering benchmarks are automatically generated using extensive preprocessing from KGs, but they do not support dialogue generation. This paper introduces Chatty-Gen, a novel multi-stage retrieval-augmented generation platform for automatically generating high-quality dialogue benchmarks tailored to a specific domain using a KG. Chatty-Gen decomposes the generation process into manageable stages and uses assertion rules for automatic validation between stages. Our approach enables control over…
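The abstract's idea of splitting generation into stages and checking each stage's output with assertion rules can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the stage names, the toy knowledge graph, the rules, and the `run_pipeline` helper are hypothetical and are not taken from Chatty-Gen. In a real system the stages would call a KG endpoint or an LLM rather than use templates.

```python
class StageValidationError(Exception):
    """Raised when a stage's output violates one of its assertion rules."""


def run_pipeline(seed, stages):
    """Run (name, stage, rules) triples in order, validating after each stage."""
    value = seed
    for name, stage, rules in stages:
        value = stage(value)
        for rule in rules:
            if not rule(value):
                raise StageValidationError(f"{name}: {rule.__name__} failed")
    return value


# Toy knowledge graph (hypothetical data, for illustration only).
TOY_KG = {"Ada Lovelace": [("birthPlace", "London"), ("field", "Mathematics")]}


def fetch_triples(entity):
    # Stage 1: retrieve the facts the later stages are allowed to use.
    return {"entity": entity, "triples": TOY_KG.get(entity, [])}


def write_questions(ctx):
    # Stage 2: one standalone question per triple (a template stands in for an LLM).
    questions = [f"What is the {p} of {ctx['entity']}?" for p, _ in ctx["triples"]]
    return {**ctx, "questions": questions}


def to_dialogue(ctx):
    # Stage 3: keep the first question, make follow-ups refer back to the entity.
    first, rest = ctx["questions"][0], ctx["questions"][1:]
    follow_ups = [q.replace(ctx["entity"], "this entity") for q in rest]
    return {"entity": ctx["entity"], "turns": [first] + follow_ups}


# Assertion rules: plain predicates over a stage's output.
def has_triples(ctx):
    return len(ctx["triples"]) > 0


def one_question_per_triple(ctx):
    return len(ctx["questions"]) == len(ctx["triples"])


def questions_end_with_mark(ctx):
    return all(q.endswith("?") for q in ctx["questions"])


def entity_named_only_in_first_turn(dialogue):
    turns, entity = dialogue["turns"], dialogue["entity"]
    return entity in turns[0] and all(entity not in t for t in turns[1:])


STAGES = [
    ("fetch_triples", fetch_triples, [has_triples]),
    ("write_questions", write_questions, [one_question_per_triple, questions_end_with_mark]),
    ("to_dialogue", to_dialogue, [entity_named_only_in_first_turn]),
]

if __name__ == "__main__":
    for turn in run_pipeline("Ada Lovelace", STAGES)["turns"]:
        print(turn)
```

Validating after every stage means a bad intermediate result, such as an entity with no retrievable facts, is rejected early instead of propagating into the final dialogue.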
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: LLaMA
