Semantic Complexity in End-to-End Spoken Language Understanding
Joseph P. McKenna, Samridhi Choudhary, Michael Saxon, Grant P. Strimel, Athanasios Mouchtaris

TL;DR
This paper investigates how the semantic complexity of datasets affects the performance of end-to-end spoken language understanding models, emphasizing the importance of dataset complexity in evaluating model capabilities.
Contribution
The study introduces empirical measures of semantic complexity and demonstrates their correlation with the performance of speech-to-interpretation (STI) models across different datasets.
Findings
Performance improves as semantic complexity decreases.
The near-perfect results reported in the literature were obtained on low-complexity datasets.
Reporting performance alongside complexity measures shows which use cases STI models are suited to.
Abstract
End-to-end spoken language understanding (SLU) models are a class of model architectures that predict semantics directly from speech. Because of their input and output types, we refer to them as speech-to-interpretation (STI) models. Previous works have successfully applied STI models to targeted use cases, such as recognizing home automation commands; however, no study has yet addressed how these models generalize to broader use cases. In this work, we analyze the relationship between the performance of STI models and the difficulty of the use case to which they are applied. We introduce empirical measures of dataset semantic complexity to quantify the difficulty of the SLU tasks. We show that near-perfect performance metrics for STI models reported in the literature were obtained with datasets that have low semantic complexity values. We perform experiments where we vary the semantic…
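To make the idea of an empirical complexity measure concrete, here is a minimal sketch of one plausible candidate: the Shannon entropy of the n-gram distribution over a dataset's transcripts. This is an illustrative assumption, not necessarily the measure the paper defines; the function name `ngram_entropy`, the whitespace tokenization, and the toy utterances are all hypothetical.

```python
import math
from collections import Counter


def ngram_entropy(transcripts, n=1):
    """Shannon entropy (in bits) of the n-gram distribution of a corpus.

    Hypothetical stand-in for a dataset complexity measure: a corpus
    that reuses a few phrases scores low, a varied corpus scores high.
    """
    counts = Counter()
    for text in transcripts:
        tokens = text.lower().split()  # naive whitespace tokenization (assumption)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1

    total = sum(counts.values())
    if total == 0:
        return 0.0  # empty corpus: define entropy as zero

    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy corpora (invented for illustration only).
narrow = ["turn on the lights", "turn off the lights"]
broad = ["play some jazz music", "what is the weather tomorrow"]

print(ngram_entropy(narrow))  # 2.25
print(ngram_entropy(broad) > ngram_entropy(narrow))  # True
```

Under this sketch, a narrow command set such as home automation would score lower than an open-ended assistant dataset, which is the kind of contrast the abstract describes.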
