On-Device LLMs for Home Assistant: Dual Role in Intent Detection and Response Generation

Rune Birkmose, Nathan Mørkeberg Reece, Esben Hofstedt Norvin, Johannes Bjerva, and Mike Zhang

arXiv:2502.12923·cs.CL·March 24, 2025

Open Access

TL;DR

This study demonstrates that fine-tuned, quantized LLMs can effectively perform intent detection and response generation for smart home assistants on resource-limited edge hardware, achieving high accuracy and semantic coherence.

Contribution

It shows that small, quantized LLMs can run on CPU-only devices for home automation tasks, unifying intent detection and response generation without specialized hardware.

Findings

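1. The 16-bit and 8-bit quantized models maintain high accuracy in slot and intent detection; the 4-bit model loses noticeable accuracy on device-service classification.

2. The models generalize to noisy human-written prompts and out-of-domain intents, reaching around 80–86% accuracy.

3. Inference time is acceptable for single commands.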

Abstract

This paper investigates whether Large Language Models (LLMs), fine-tuned on synthetic but domain-representative data, can perform the twofold task of (i) slot and intent detection and (ii) natural language response generation for a smart home assistant, while running solely on resource-limited, CPU-only edge hardware. We fine-tune LLMs to produce both JSON action calls and text responses. Our experiments show that 16-bit and 8-bit quantized variants preserve high accuracy on slot and intent detection and maintain strong semantic coherence in generated text, whereas the 4-bit model, although it retains generative fluency, suffers a noticeable drop in device-service classification accuracy. Further evaluations on noisy human (non-synthetic) prompts and out-of-domain intents confirm the models' generalization ability, obtaining around 80–86% accuracy. While the average inference time is 5–6…
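The abstract describes models that emit both a JSON action call and a text response. As a rough illustration only, the sketch below shows how a client might separate the two parts of one generation. It assumes, hypothetically, that the model writes a single JSON object followed by free text; the function name `split_model_output` and the `device` and `service` keys are illustrative choices and are not taken from the paper.

```python
import json


def split_model_output(raw: str):
    """Split one generation into (action, response).

    Assumed (hypothetical) format: a single JSON object somewhere in the
    text, with the natural-language reply around it. Returns (None, text)
    when no valid JSON object is found.
    """
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # returns it together with the index where it ended.
            action, end = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = raw.find("{", start + 1)
            continue
        # Everything outside the JSON object is treated as the reply.
        response = (raw[:start] + raw[end:]).strip()
        return action, response
    return None, raw.strip()


if __name__ == "__main__":
    # Illustrative output string, not real model output from the paper.
    sample = '{"device": "light.kitchen", "service": "turn_on"}\nTurning on the kitchen light.'
    action, response = split_model_output(sample)
    print(action)
    print(response)
```

Falling back to `(None, text)` lets a caller still speak the reply when the action call is malformed, which may matter more for aggressively quantized models; whether that fallback is appropriate depends on the deployment.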

Taxonomy

Topics: Topic Modeling · AI in Service Interactions · Speech and dialogue systems