Synthetic Dialogue Dataset Generation using LLM Agents

Yelaman Abdullin; Diego Molla-Aliod; Bahadorreza Ofoghi; John Yearwood; Qingyang Li

arXiv:2401.17461 · cs.CL · February 1, 2024 · 1 citation

TL;DR

This paper introduces a method to generate synthetic dialogues between LLM agents to facilitate the training of goal-oriented conversational agents for linear programming problem modeling, with evaluations showing promising quality.

Contribution

It presents a novel approach using prompt engineering to create and evaluate synthetic dialogues for training LP modeling agents, including human and GPT-4 assessments.

Findings

01 High-quality dialogues generated for LP problem description

02 Effective evaluation methods, including GPT-4-based assessment

03 Available dataset and baseline conversational agent for research

Abstract

Linear programming (LP) problems are pervasive in real-life applications. However, despite their apparent simplicity, an untrained user may find it difficult to determine the linear model of their specific problem. We envisage the creation of a goal-oriented conversational agent that will engage in conversation with the user to elicit all information required so that a subsequent agent can generate the linear model. In this paper, we present an approach for the generation of sample dialogues that can be used to develop and train such a conversational agent. Using prompt engineering, we develop two agents that "talk" to each other, one acting as the conversational agent, and the other acting as the user. Using a set of text descriptions of linear problems from NL4Opt available to the user only, the agent and the user engage in conversation until the agent has retrieved all key…
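The two-agent setup described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the `Responder` type, the `[END]` stop marker, the turn limit, and the scripted stand-ins for the LLM calls are all assumptions made so the example runs without any model or API. The one property taken from the abstract is that only the user side has access to the problem description.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]                     # {"speaker": ..., "text": ...}
Responder = Callable[[List[Message]], str]   # hypothetical stand-in for an LLM call

END_TOKEN = "[END]"  # hypothetical stop marker, not taken from the paper


def run_dialogue(agent: Responder, user: Responder, max_turns: int = 10) -> List[Message]:
    """Alternate agent questions and user answers until the agent signals it is done."""
    history: List[Message] = []
    for _ in range(max_turns):
        question = agent(history)
        history.append({"speaker": "agent", "text": question})
        if END_TOKEN in question:
            break
        answer = user(history)
        history.append({"speaker": "user", "text": answer})
    return history


def make_scripted_agent(questions: List[str]) -> Responder:
    """Stub agent: replays fixed questions, then emits the stop marker."""
    remaining = iter(questions)

    def respond(history: List[Message]) -> str:
        return next(remaining, END_TOKEN)

    return respond


def make_stub_user(problem_description: str) -> Responder:
    """Stub user: the only party that sees the problem text; reveals one sentence per turn."""
    sentences = [s.strip() for s in problem_description.split(".") if s.strip()]
    remaining = iter(sentences)

    def respond(history: List[Message]) -> str:
        return next(remaining, "That is all I know.")

    return respond


# Illustrative problem text (invented for this sketch, not from NL4Opt).
problem = (
    "A bakery sells bread and cakes. "
    "Bread earns 2 dollars and cake earns 5 dollars. "
    "There are 40 oven hours available"
)
agent = make_scripted_agent([
    "What do you need to decide?",
    "What is the profit per item?",
    "What limits production?",
    "Thanks, I have what I need. " + END_TOKEN,
])
user = make_stub_user(problem)
dialogue = run_dialogue(agent, user)

for message in dialogue:
    print(f'{message["speaker"]}: {message["text"]}')
```

In a real pipeline each `Responder` would wrap a prompted LLM call, with the problem description placed only in the user-side prompt; the loop structure itself would stay the same.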

Code & Models

Repositories

eabdullin/optimouse-quest (official repository)

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Service-Oriented Architecture and Web Services

Methods: Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Byte Pair Encoding · Residual Connection · Dropout · Layer Normalization · Multi-Head Attention · Adam · Softmax