Reject or Not?: A Benchmark for Voice Assistant Query Rejection in Smart Home Scenario and an Improved Method Based on LLMs
Huichao Men, Yizhen Hu, Yingyang He, Yu Gao, Xiaofeng Mou, Yi Xu

TL;DR
This paper introduces a new benchmark and an improved LLM-based method for query rejection in Chinese smart-home voice assistants, enhancing rejection accuracy in complex, personalized scenarios.
Contribution
It provides the first Chinese-oriented open-source dataset and evaluation suite, along with a novel three-tier collaborative architecture for improved query rejection.
Findings
Significant improvement over baseline models in rejection accuracy
Effective handling of personalized and multi-turn scenarios
Established a reproducible benchmark for future research
Abstract
In smart-home voice assistant scenarios, deciding whether to accept or reject a user query is the first step before any downstream processing. To address the limited query-rejection capability of current voice assistants, this paper presents the first Chinese-oriented open-source benchmark and evaluation suite for smart homes, together with a personalized query-rejection method based on large language models. On the data side, we construct the first multimodal query-rejection dataset tailored for domestic scenarios, containing 11,913 manually labeled text-speech pairs that systematically cover twelve typical dialogue types (e.g., chit-chat, non-human sounds, valid commands, ambiguous references, device-irrelevant requests). Fine-grained labels, conversational context, and multi-turn information are provided to support both zero-shot and fine-tuning evaluations across language and…
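To make the accept/reject framing concrete, the following is a minimal sketch of how rejection accuracy might be scored overall and per dialogue type. It is not the paper's evaluation suite: the `Sample` fields, the `evaluate` helper, the toy queries, their gold labels, and the keyword baseline are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical record layout; the real dataset schema may differ."""
    text: str           # transcript of the spoken query
    dialogue_type: str  # fine-grained type, e.g. "chit-chat"
    reject: bool        # gold label: True if the assistant should reject


def evaluate(samples, predict):
    """Return overall accuracy and accuracy per dialogue type."""
    total_correct = 0
    per_type = defaultdict(lambda: [0, 0])  # type -> [correct, count]
    for s in samples:
        ok = predict(s.text) == s.reject
        total_correct += ok
        per_type[s.dialogue_type][0] += ok
        per_type[s.dialogue_type][1] += 1
    overall = total_correct / len(samples)
    by_type = {t: c / n for t, (c, n) in per_type.items()}
    return overall, by_type


# Toy examples with assumed labels (not taken from the dataset).
samples = [
    Sample("turn on the living room light", "valid command", False),
    Sample("set the air conditioner to 26 degrees", "valid command", False),
    Sample("what a long day", "chit-chat", True),
    Sample("I set the table already", "chit-chat", True),
    Sample("book me a flight to Shanghai", "device-irrelevant request", True),
]


def keyword_baseline(text):
    """Naive stand-in for a model: accept only if a command keyword appears."""
    return not any(k in text for k in ("turn on", "set"))


overall, by_type = evaluate(samples, keyword_baseline)
print(overall, by_type)
```

The per-type breakdown shows where a rejector fails: here the keyword baseline wrongly accepts a chit-chat utterance that happens to contain "set", which is the kind of error a fine-grained benchmark is meant to expose.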
Taxonomy
Topics: AI in Service Interactions · Speech and dialogue systems · Speech Recognition and Synthesis
