Spoken Language Understanding on Unseen Tasks With In-Context Learning
Neeraj Agrawal, Sriram Ganapathy

TL;DR
This paper proposes a novel fine-tuning method for speech-text large language models that improves their zero-shot and few-shot performance on unseen spoken language understanding tasks without needing task-specific data annotations.
Contribution
The paper introduces a robust, task-agnostic fine-tuning approach that uses randomized class labels, improving the ability of speech-text LLMs to handle unseen SLU tasks.
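The summary does not spell out how the class labels are randomized, so the following is only a minimal sketch of one plausible reading: within each in-context episode, every original class is remapped to an arbitrary symbol, so the model has to infer the label mapping from the demonstrations rather than from label semantics. The function names (`randomize_labels`, `build_prompt`), the prompt template, and the use of text strings in place of speech inputs are all assumptions made for illustration, not details taken from the paper.

```python
import random


def randomize_labels(examples, label_pool, rng):
    """Remap each original class label to a random symbol from label_pool.

    Hypothetical helper: the mapping is consistent within one episode and is
    expected to be redrawn for every new episode.
    """
    classes = sorted({label for _, label in examples})
    if len(classes) > len(label_pool):
        raise ValueError("label_pool must contain at least one symbol per class")
    # Sampling without replacement guarantees distinct symbols per class.
    symbols = rng.sample(label_pool, len(classes))
    mapping = dict(zip(classes, symbols))
    remapped = [(text, mapping[label]) for text, label in examples]
    return remapped, mapping


def build_prompt(demos, query):
    """Format demonstrations plus one unlabeled query (assumed template)."""
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    rng = random.Random(0)
    # Text stands in for speech inputs in this sketch.
    demos = [
        ("turn on the lights", "command"),
        ("what time is it", "question"),
        ("switch off the fan", "command"),
    ]
    remapped, mapping = randomize_labels(demos, ["A", "B", "C", "D"], rng)
    print(mapping)
    print(build_prompt(remapped, "is it raining"))
```

Redrawing the mapping for each episode is the design choice that matters here: a fixed mapping would simply let the model memorize new label names, whereas a fresh mapping forces reliance on the in-context demonstrations.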
Findings
Significant performance improvement on unseen SLU tasks.
Effective zero/few-shot learning without task-specific annotations.
Outperforms standard approaches in robustness and generalization.
Abstract
Spoken language understanding (SLU) tasks involve diverse skills that probe the information extraction, classification and/or generation capabilities of models. In this setting, task-specific training data may not always be available. While traditional task-specific SLU models are unable to cater to such requirements, speech-text large language models (LLMs) offer a promising alternative with emergent abilities. However, our evaluations indicate that, out of the box, the zero/few-shot performance of prominent open-source speech-text LLMs on SLU tasks is not up to the mark. In this paper, we introduce a novel approach to robust task-agnostic fine-tuning using randomized class labels. With this proposed fine-tuning, we illustrate that the performance of the speech-text LLMs on an unseen task is significantly improved over standard approaches. Critically, the proposed approach avoids…
