Multi-party Goal Tracking with LLMs: Comparing Pre-training, Fine-tuning, and Prompt Engineering
Angus Addlesee, Weronika Sieińska, Nancie Gunson, Daniel Hernández Garcia, Christian Dondrup, Oliver Lemon

TL;DR
This study assesses how well current Large Language Models can understand multi-party goal-oriented conversations, comparing fine-tuning, pre-training, and prompt engineering, with GPT-3.5-turbo showing notable effectiveness in limited-data scenarios.
Contribution
It introduces a novel multi-party conversation dataset and systematically compares different LLM approaches for goal tracking and intent recognition in MPCs.
Findings
GPT-3.5-turbo outperforms fine-tuned models in few-shot settings.
Reasoning prompts achieve the highest accuracy in goal and intent recognition.
Multi-party conversations remain challenging for current LLMs.
Abstract
This paper evaluates the extent to which current Large Language Models (LLMs) can capture task-oriented multi-party conversations (MPCs). We have recorded and transcribed 29 MPCs between patients, their companions, and a social robot in a hospital. We then annotated this corpus for multi-party goal-tracking and intent-slot recognition. People share goals, answer each other's goals, and provide other people's goals in MPCs, none of which occur in dyadic interactions. To understand user goals in MPCs, we compared three methods in zero-shot and few-shot settings: we fine-tuned T5, created pre-training tasks to train DialogLM using LED, and employed prompt engineering techniques with GPT-3.5-turbo, to determine which approach can complete this novel task with limited data. GPT-3.5-turbo significantly outperformed the others in a few-shot setting. The 'reasoning' style prompt, when given 7%…
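To make the few-shot prompting setup more concrete, the sketch below assembles a prompt from a handful of annotated conversations followed by an unannotated target conversation. It is a minimal illustration only: the `Turn` and `Example` classes, the instruction wording, the goal-annotation strings, and the sample dialogue are all assumptions made for this example and are not taken from the paper's corpus or prompts. No model is called; the function only builds the prompt text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str  # e.g. "Patient", "Companion", "Robot" (hypothetical labels)
    text: str


@dataclass
class Example:
    turns: List[Turn]
    goals: List[str]  # hypothetical annotations, one string per tracked goal


def format_dialogue(turns: List[Turn]) -> str:
    """Render a multi-party conversation as one 'Speaker: text' line per turn."""
    return "\n".join(f"{t.speaker}: {t.text}" for t in turns)


def build_few_shot_prompt(
    examples: List[Example],
    target_turns: List[Turn],
    instruction: str = (
        "List each goal raised in the dialogue, who holds it, "
        "and whether it has been answered."
    ),
) -> str:
    """Concatenate the instruction, annotated examples, and the target dialogue."""
    parts = [instruction]
    for ex in examples:
        parts.append("Conversation:\n" + format_dialogue(ex.turns))
        parts.append("Goals:\n" + "\n".join(f"- {g}" for g in ex.goals))
    # The target conversation ends with an open "Goals:" slot for the model.
    parts.append("Conversation:\n" + format_dialogue(target_turns))
    parts.append("Goals:")
    return "\n\n".join(parts)


# Invented sample data, for illustration only.
example = Example(
    turns=[
        Turn("Patient", "Where is the cafe?"),
        Turn("Companion", "I think it is on the ground floor."),
    ],
    goals=["Patient: find the cafe (answered by Companion)"],
)
target = [
    Turn("Companion", "She needs to find the toilet."),
    Turn("Robot", "The toilets are next to the lift."),
]

prompt = build_few_shot_prompt([example], target)
print(prompt)
```

With zero examples the same function yields a zero-shot prompt, so the two settings compared in the abstract differ only in the length of the `examples` list.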
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Artificial Intelligence in Healthcare and Education
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Cosine Annealing · Weight Decay · Linear Warmup With Cosine Annealing
