StarDrinks: An English and Korean Test Set for SLU Evaluation in a Drink Ordering Scenario

Marcely Zanon Boito; Caroline Brun; Inyoung Kim; Denys Proux; Salah Ait-Mokhtar; Nikolaos Lagos; Jean-Luc Meunier; Ioan Calapodescu

arXiv:2604.26500·cs.CL·April 30, 2026

TL;DR

StarDrinks is a bilingual test set designed to evaluate speech and language understanding models in realistic drink ordering scenarios, capturing linguistic variability and spontaneous speech phenomena.

Contribution

We introduce StarDrinks, a bilingual (English and Korean) test set with annotated speech and transcriptions for evaluating SLU and NLU in a realistic drink ordering scenario.

Findings

1. Supports speech-to-slots, transcription-to-slots, and speech-to-transcription evaluations.

2. Captures diverse named entities, customizations, and spontaneous speech phenomena.

3. Provides a benchmark for model robustness and generalization in task-oriented dialogue.

Abstract

LLMs and speech assistants are increasingly used for task-oriented interactions, yet their evaluation often relies on controlled scenarios that fail to capture the variability and complexity of real user requests. Drink ordering, for example, involves diverse named entities, drink types, sizes, customizations, and brand-specific terminology, as well as spontaneous speech phenomena such as hesitations and self-corrections. To address this gap, we introduce StarDrinks, a test set in English and Korean containing speech utterance features, transcriptions, and annotated slots. Our dataset supports three evaluation settings: speech-to-slots (SLU), transcription-to-slots (NLU), and speech-to-transcription (ASR). It provides a realistic benchmark for model robustness and generalization in a linguistically rich, real-world task.
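As a rough illustration of how the three evaluation settings named in the abstract are commonly scored, the sketch below computes word error rate for speech-to-transcription and a slot-level F1 for the two slot-filling settings. Everything here is an assumption made for illustration: the slot names (`size`, `drink`, `milk`), the example utterance, and the choice of WER and exact-match F1 as metrics are hypothetical and are not taken from the StarDrinks annotation scheme or from the paper's evaluation protocol.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / max(len(ref), 1)


def slot_f1(gold, pred) -> float:
    """Exact-match F1 over (slot_name, value) pairs."""
    gold_counts = Counter(gold)
    pred_counts = Counter(pred)
    true_positives = sum((gold_counts & pred_counts).values())
    if true_positives == 0:
        return 0.0
    precision = true_positives / sum(pred_counts.values())
    recall = true_positives / sum(gold_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical utterance and slot labels, not drawn from the dataset.
reference = "one large iced latte with oat milk"
asr_output = "one large ice latte with oat milk"

gold_slots = [("size", "large"), ("drink", "iced latte"), ("milk", "oat milk")]
pred_slots = [("size", "large"), ("drink", "latte"), ("milk", "oat milk")]

wer = word_error_rate(reference, asr_output)
f1 = slot_f1(gold_slots, pred_slots)
```

Under these assumptions, the same `slot_f1` function would score both speech-to-slots and transcription-to-slots predictions; only the model input differs (audio versus reference transcription), which lets the gap between the two scores indicate how much error comes from the speech side.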
