ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA datasets with Large Language Models

Ronak Pradeep; Daniel Lee; Ali Mousavi; Jeff Pound; Yisi Sang; Jimmy Lin; Ihab Ilyas; Saloni Potdar; Mostafa Arefiyan; Yunyao Li

arXiv:2408.05948·cs.CL·August 13, 2024

TL;DR

ConvKGYarn is a scalable, configurable method for generating up-to-date conversational Knowledge Graph QA datasets, enabling better training and evaluation of Large Language Models in dynamic, diverse interaction scenarios.

Contribution

The paper introduces a novel, scalable approach for creating high-quality, configurable conversational KGQA datasets that adapt to evolving user information needs.

Findings

01 High-quality datasets comparable to existing ones
02 Effective testing of LLMs on diverse conversational configurations
03 Enhanced evaluation of LLMs' parametric knowledge

Abstract

The rapid advancement of Large Language Models (LLMs) and conversational assistants necessitates dynamic, scalable, and configurable conversational datasets for training and evaluation. These datasets must accommodate diverse user interaction modes, including text and voice, each presenting unique modeling challenges. Knowledge Graphs (KGs), with their structured and evolving nature, offer an ideal foundation for current and precise knowledge. Although human-curated KG-based conversational datasets exist, they struggle to keep pace with the rapidly changing user information needs. We present ConvKGYarn, a scalable method for generating up-to-date and configurable conversational KGQA datasets. Qualitative psychometric analyses confirm our method can generate high-quality datasets rivaling a popular conversational KGQA dataset while offering it at scale and covering a wide range of…

Taxonomy

Topics: Topic Modeling · Semantic Web and Ontologies · Natural Language Processing Techniques