Enhancing the Preference Extractor in Multi-turn Dialogues: From Annotating Disasters to Accurate Preference Extraction
Cheng Wang, Ziru Liu, Pengcheng Tang, Mingyu Zhang, Quanyu Dai, Yue Zhu

TL;DR
This paper introduces IterChat, a novel framework for generating high-quality dialogue datasets that improve preference extraction accuracy in multi-turn dialogues by reducing annotation errors and enhancing data diversity.
Contribution
The paper proposes a new data format and a GPT-4-based data generation method to improve preference extraction in multi-turn dialogues, addressing both the annotation burden and error propagation across turns.
Findings
Enhanced preference extraction performance with the new data format.
28.4% higher annotator efficiency using the proposed method.
Superior results in fine-tuning and few-shot prompting scenarios.
Abstract
Identifying user preferences in dialogue systems is a pivotal aspect of providing satisfying services. Current research shows that using large language models (LLMs) to fine-tune a task-specific preference extractor yields excellent results in terms of accuracy and generalization. However, the primary challenge stems from the inherent difficulty in obtaining high-quality labeled multi-turn dialogue data. Accurately tracking user preference transitions across turns not only demands intensive domain expertise and contextual consistency maintenance for annotators (termed "Annotating Disaster") but also complicates model training due to error propagation in sequential dependency learning. Inspired by the observation that multi-turn preference extraction can be decomposed into iterative executions of one-turn extraction processes, we propose a novel dialogue data generation…
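The decomposition mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `extract_multi_turn` and `toy_one_turn`, the `slot=value` turn format, and the dictionary representation of preferences are all assumptions made for illustration. The sketch only shows the general idea that a one-turn extractor, given the previous preference state and the current turn, can be applied repeatedly to cover a whole dialogue.

```python
from typing import Callable, Dict, List

Preferences = Dict[str, str]
# Hypothetical one-turn extractor signature (an assumption, not from the paper):
# (previous preferences, current turn text) -> updated preferences
OneTurnExtractor = Callable[[Preferences, str], Preferences]


def extract_multi_turn(turns: List[str], one_turn: OneTurnExtractor) -> List[Preferences]:
    """Apply a one-turn extractor iteratively, carrying the preference state forward."""
    state: Preferences = {}
    history: List[Preferences] = []
    for turn in turns:
        # Each step sees only the previous state and the current turn.
        state = one_turn(dict(state), turn)
        history.append(dict(state))
    return history


def toy_one_turn(prev: Preferences, turn: str) -> Preferences:
    """Toy stand-in for a learned extractor.

    Parses 'slot=value' tokens from the turn; 'slot=' (empty value) clears the slot.
    """
    updated = dict(prev)
    for token in turn.split():
        if "=" in token:
            slot, value = token.split("=", 1)
            if value:
                updated[slot] = value
            else:
                updated.pop(slot, None)
    return updated
```

In practice the toy extractor would be replaced by a fine-tuned or few-shot-prompted model; the loop itself stays the same, which is what makes per-turn annotation and per-turn training possible.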
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Sentiment Analysis and Opinion Mining
