MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue
Nikita Moghe, Evgeniia Razumovskaia, Liane Guillou, Ivan Vulić, Anna Korhonen, Alexandra Birch

TL;DR
This paper introduces MULTI3NLU++, a comprehensive multilingual dataset for natural language understanding in task-oriented dialogue, covering multiple languages, domains, and user intents to improve system generalization.
Contribution
The creation of MULTI3NLU++, the first multilingual, multi-intent, multi-domain dataset for NLU in task-oriented dialogue, with manual translations of the English NLU++ data into high-, medium-, and low-resource languages.
Findings
Multilingual models struggle with low-resource languages.
The dataset reveals the complexity of multi-domain, multi-intent NLU tasks.
Benchmark results highlight room for improvement in multilingual TOD systems.
Abstract
Task-oriented dialogue (TOD) systems have been widely deployed in many industries as they deliver more efficient customer support. These systems are typically constructed for a single domain or language and do not generalise well beyond this. To support work on Natural Language Understanding (NLU) in TOD across multiple languages and domains simultaneously, we constructed MULTI3NLU++, a multilingual, multi-intent, multi-domain dataset. MULTI3NLU++ extends the English-only NLU++ dataset to include manual translations into a range of high-, medium-, and low-resource languages (Spanish, Marathi, Turkish and Amharic), in two domains (BANKING and HOTELS). Because of its multi-intent property, MULTI3NLU++ represents complex and natural user goals, and therefore allows us to measure the realistic performance of TOD systems in a varied set of the world's languages. We use MULTI3NLU++ to benchmark…
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Natural Language Processing Techniques
