ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA datasets with Large Language Models
Ronak Pradeep, Daniel Lee, Ali Mousavi, Jeff Pound, Yisi Sang, Jimmy Lin, Ihab Ilyas, Saloni Potdar, Mostafa Arefiyan, Yunyao Li

TL;DR
ConvKGYarn is a scalable, configurable method for generating up-to-date conversational Knowledge Graph QA datasets, enabling better training and evaluation of Large Language Models in dynamic, diverse interaction scenarios.
Contribution
It introduces a novel scalable approach for creating high-quality, configurable conversational KGQA datasets that adapt to evolving user information needs.
Findings
High-quality datasets comparable to existing ones
Effective testing of LLMs on diverse conversational configurations
Enhanced evaluation of LLMs' parametric knowledge
Abstract
The rapid advancement of Large Language Models (LLMs) and conversational assistants necessitates dynamic, scalable, and configurable conversational datasets for training and evaluation. These datasets must accommodate diverse user interaction modes, including text and voice, each presenting unique modeling challenges. Knowledge Graphs (KGs), with their structured and evolving nature, offer an ideal foundation for current and precise knowledge. Although human-curated KG-based conversational datasets exist, they struggle to keep pace with the rapidly changing user information needs. We present ConvKGYarn, a scalable method for generating up-to-date and configurable conversational KGQA datasets. Qualitative psychometric analyses confirm our method can generate high-quality datasets rivaling a popular conversational KGQA dataset while offering it at scale and covering a wide range of…
Taxonomy
Topics: Topic Modeling · Semantic Web and Ontologies · Natural Language Processing Techniques
