Reject or Not?: A Benchmark for Voice Assistant Query Rejection in Smart Home Scenario and an Improved Method Based on LLMs

Huichao Men; Yizhen Hu; Yingyang He; Yu Gao; Xiaofeng Mou; Yi Xu

arXiv:2512.10257 · cs.HC · December 15, 2025



Open Access

TL;DR

This paper introduces a new benchmark and an improved LLM-based method for query rejection in Chinese smart-home voice assistants, enhancing rejection accuracy in complex, personalized scenarios.

Contribution

It provides the first Chinese-oriented open-source dataset and evaluation suite, along with a novel three-tier collaborative architecture for improved query rejection.

Findings

01 Significant improvement over baseline models in rejection accuracy

02 Effective handling of personalized and multi-turn scenarios

03 A reproducible benchmark for future research

Abstract

In smart-home voice assistant scenarios, deciding whether to accept or reject a user query is the first step before any downstream processing. To address the limited query-rejection capability of current voice assistants, this paper presents the first Chinese-oriented open-source benchmark and evaluation suite for smart homes, together with a personalized query-rejection method based on large language models. On the data side, we construct the first multimodal query-rejection dataset tailored for domestic scenarios, containing 11,913 manually labeled text-speech pairs that systematically cover twelve typical dialogue types (e.g., chit-chat, non-human sounds, valid commands, ambiguous references, device-irrelevant requests). Fine-grained labels, conversational context, and multi-turn information are provided to support both zero-shot and fine-tuning evaluations across language and…
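As a rough illustration of the accept-or-reject decision and the labeled text-speech pairs described in the abstract, the sketch below models one dataset record and scores a list of predictions. It is not the paper's data schema or evaluation code: the `QueryExample` fields, the file paths, the sample utterances and their labels, and the `rejection_accuracy` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryExample:
    """One text-speech pair (hypothetical schema, not the paper's)."""
    text: str            # transcript of the utterance
    audio_path: str      # hypothetical path to the paired speech clip
    dialogue_type: str   # one of the dialogue types, e.g. "chit-chat"
    reject: bool         # gold label: True means the query should be rejected
    context: List[str] = field(default_factory=list)  # earlier turns, if any


def rejection_accuracy(gold: List[QueryExample], predicted: List[bool]) -> float:
    """Fraction of examples whose predicted reject flag matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty list")
    correct = sum(g.reject == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Invented samples; the labels are illustrative assumptions only.
examples = [
    QueryExample("turn on the living room light", "clips/0001.wav",
                 "valid command", reject=False),
    QueryExample("what a long day", "clips/0002.wav",
                 "chit-chat", reject=True),
    QueryExample("turn that one off", "clips/0003.wav",
                 "ambiguous reference", reject=True,
                 context=["turn on the lamp"]),
]

# Stand-in for the output of any accept/reject classifier.
predictions = [False, True, False]
accuracy = rejection_accuracy(examples, predictions)
print(f"rejection accuracy: {accuracy:.3f}")
```

Keeping the conversational context on each record lets the same structure serve both single-turn and multi-turn evaluation, which is one plausible way to organize such data.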


Taxonomy

Topics: AI in Service Interactions · Speech and dialogue systems · Speech Recognition and Synthesis