MAC-SLU: Multi-Intent Automotive Cabin Spoken Language Understanding Benchmark

Yuezhang Peng; Chonghao Cai; Ziang Liu; Shuai Fan; Sheng Jiang; Hua Xu; Yuxin Liu; Qiguang Chen; Kele Xu; Yao Li; Sheng Wang; Libo Qin; Xie Chen

arXiv:2512.01603·cs.CL·December 2, 2025

TL;DR

This paper introduces MAC-SLU, a challenging multi-intent spoken language understanding dataset for automotive cabins, and benchmarks open-source Large Language Models (LLMs) and Large Audio Language Models (LALMs) on it, revealing the current limitations and potential of each approach.

Contribution

The paper presents MAC-SLU, a new complex dataset for automotive SLU, and provides a comprehensive benchmark of LLMs and LALMs, highlighting their strengths and weaknesses.

Findings

01. LLMs perform well with in-context learning but lag behind supervised fine-tuning.

02. End-to-end LALMs match pipeline approaches and reduce error propagation.

03. The dataset increases SLU task difficulty with authentic multi-intent data.
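
The error propagation mentioned in the second finding can be illustrated with a toy sketch. It is not the paper's method: the rule table, the intent labels (`window.open`, `radio.on`), and the simulated ASR mistake are all hypothetical and are not taken from MAC-SLU. The sketch only shows how, in a pipeline, a single transcription error can silently drop one intent from a multi-intent command, which is the failure mode an end-to-end model avoids by not depending on an intermediate transcript.

```python
def parse_intents(transcript: str) -> list[str]:
    """Toy text-based SLU stage (hypothetical rules, not the MAC-SLU schema).

    Splits the utterance on ' and ' and maps each clause to an intent label.
    Clauses that match no rule are dropped.
    """
    rules = {
        "open the window": "window.open",
        "turn on the radio": "radio.on",
    }
    intents = []
    for clause in transcript.lower().split(" and "):
        intent = rules.get(clause.strip())
        if intent is not None:
            intents.append(intent)
    return intents


# What the user actually said (reference transcript).
reference = "open the window and turn on the radio"
# Simulated ASR output with one misrecognized word (assumption for illustration).
asr_output = "open the window and turn on the rodeo"

gold = parse_intents(reference)       # intents from the correct transcript
pipeline = parse_intents(asr_output)  # intents after the ASR error propagates

print("gold:    ", gold)
print("pipeline:", pipeline)
```

In this sketch the second intent is lost entirely, even though only one word was misrecognized.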

Abstract

Spoken Language Understanding (SLU), which aims to extract user semantics to execute downstream tasks, is a crucial component of task-oriented dialog systems. Existing SLU datasets generally lack sufficient diversity and complexity, and there is an absence of a unified benchmark for the latest Large Language Models (LLMs) and Large Audio Language Models (LALMs). This work introduces MAC-SLU, a novel Multi-Intent Automotive Cabin Spoken Language Understanding Dataset, which increases the difficulty of the SLU task by incorporating authentic and complex multi-intent data. Based on MAC-SLU, we conducted a comprehensive benchmark of leading open-source LLMs and LALMs, covering methods like in-context learning, supervised fine-tuning (SFT), and end-to-end (E2E) and pipeline paradigms. Our experiments show that while LLMs and LALMs have the potential to complete SLU tasks through in-context…

Code & Models

Datasets

Gatsby1984/MAC_SLU
dataset · 67 downloads

Taxonomy

Topics: Speech and dialogue systems · Speech Recognition and Synthesis · Emotion and Mood Recognition