Slot Filling as a Reasoning Task for SpeechLLMs
Kadri Hacioglu, Manjunath K E, Andreas Stolcke

TL;DR
This paper introduces a reasoning-based approach to slot filling with speechLLMs: a chain-of-thought framework decomposes the task into intermediate reasoning steps, and hybrid models that keep both direct and reasoning modes are explored.
Contribution
It presents a novel reasoning framework for speechLLMs, including a reasoning dataset, and demonstrates the benefits of hybrid models that combine direct and reasoning modes.
Findings
Reasoning steps improve slot-filling performance.
Hybrid speechLLMs outperform single-mode models.
Reasoning text LLMs developed mainly for math, logic, and coding may be inferior as foundation models for a reasoning speechLLM.
Abstract
We propose integration of reasoning into speech large language models (speechLLMs) for the end-to-end slot-filling task. Inspired by the recent development of reasoning LLMs, we use a chain-of-thought framework to decompose the slot-filling task into multiple reasoning steps, create a reasoning dataset and apply the supervised fine-tuning strategy to a speechLLM. We distinguish between regular and reasoning speechLLMs and experiment with different types and sizes of LLMs as their text foundation models. We demonstrate performance improvements by introducing reasoning (intermediate) steps. However, we show that a reasoning textual LLM developed mainly for math, logic and coding domains might be inferior as a foundation model for a reasoning speechLLM. We further show that hybrid speechLLMs, built on a hybrid text foundation LLM and fine-tuned to preserve both direct and reasoning modes…
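To make the distinction between direct and reasoning modes concrete, here is a minimal Python sketch of how one supervised fine-tuning target might be built in each mode. The abstract does not specify the reasoning steps, the output format, or any delimiters, so everything here is an assumption for illustration: the `<think>` tags, the three step names, the JSON slot format, and the helper names `build_target` and `parse_slots` are all hypothetical.

```python
import json

# Hypothetical delimiters for the reasoning block; not taken from the paper.
THINK_OPEN, THINK_CLOSE = "<think>", "</think>"


def build_target(transcript, slots, mode="reasoning"):
    """Build an illustrative fine-tuning target string for one utterance.

    mode="direct":    only the final slot dictionary, serialized as JSON.
    mode="reasoning": assumed intermediate steps first, then the same JSON.
    """
    answer = json.dumps(slots, sort_keys=True)
    if mode == "direct":
        return answer
    if mode != "reasoning":
        raise ValueError(f"unknown mode: {mode!r}")

    # Assumed decomposition: transcript, candidate values, slot assignment.
    steps = [
        f"Step 1 (transcript): {transcript}",
        "Step 2 (candidate values): " + "; ".join(slots.values()),
        "Step 3 (slot assignment): "
        + "; ".join(f"{value} -> {name}" for name, value in slots.items()),
    ]
    return f"{THINK_OPEN}\n" + "\n".join(steps) + f"\n{THINK_CLOSE}\n{answer}"


def parse_slots(model_output):
    """Return the final slot dictionary, skipping any reasoning block."""
    # rpartition leaves the whole string in the tail when no tag is present,
    # so the same parser handles direct and reasoning outputs.
    _, _, tail = model_output.rpartition(THINK_CLOSE)
    return json.loads(tail.strip())


if __name__ == "__main__":
    example_slots = {"city": "boston", "date": "friday"}
    text = "book a flight to boston on friday"
    print(build_target(text, example_slots, mode="direct"))
    print(build_target(text, example_slots, mode="reasoning"))
```

Under these assumptions, a hybrid training set would simply mix targets built with both `mode` values, and a single parser could score the final slots from either kind of output.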
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
