Can MLLMs Generalize to Multi-Party dialog? Exploring Multilingual Response Generation in Complex Scenarios
Zhongtian Hu, Yiwen Cui, Ronghan Li, Meng Zhao, Lifang Wang

TL;DR
This paper evaluates multilingual large language models' ability to handle complex multi-party dialogues, introduces a new dataset from podcasts, and finds that current models struggle to generalize, with limited improvements from fine-tuning.
Contribution
It introduces XMP, the first high-quality parallel dataset for multi-party multilingual dialogues, and systematically assesses LLMs' capabilities in this complex setting.
Findings
MLLMs fail to generalize to multi-party dialogues
Fine-tuning on XMP yields marginal improvements
Multilingual mixing during fine-tuning is generally detrimental
Abstract
Current multilingual large language models (MLLMs) still focus on simple question-answering formats, often overlooking more complex dialogue scenarios. In other words, the capabilities of multilingual large models have yet to be validated in dialogue tasks with intricate structures. We therefore ask, Q1: How well do LLMs generalize to more complex dialogue scenarios? Q2: Can supervised fine-tuning on a high-quality parallel benchmark restore this ability? Q3: Does the "multilingual complementarity" effect survive in this setting? To answer these questions, we introduce XMP, a high-quality parallel Multilingual dataset sourced from Multi-party Podcast dialogues, which is the first parallel dataset focusing on multi-party dialogue scenarios. Most samples in the dataset feature three or more participants discussing a wide range of topics. Through extensive experiments, we find that, R1:…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems
Methods: Focus · Shrink and Fine-Tune
