DESAMO: A Device for Elder-Friendly Smart Homes Powered by Embedded LLM with Audio Modality
Youngwon Choi, Donghyuk Jung, Hwayeon Kim

TL;DR
DESAMO is an elder-friendly smart home device that uses an embedded Audio LLM to understand raw audio directly, providing natural, private interactions and improved recognition of elderly users' speech and critical events.
Contribution
DESAMO introduces an on-device Audio LLM system that processes raw audio directly for elder-friendly interaction. This avoids two limitations of traditional speech recognition pipelines: difficulty with unclear speech and the inability to handle non-speech audio.
Findings
Supports natural and private interactions for elderly users
Effectively detects critical events like falls or calls for help
Processes raw audio directly without relying on traditional ASR pipelines
Abstract
We present DESAMO, an on-device smart home system for elder-friendly use, powered by an Audio LLM, that supports natural and private interactions. Conventional voice assistants rely on ASR-based pipelines or ASR-LLM cascades, which often struggle with the unclear speech common among elderly users and cannot handle non-speech audio. DESAMO instead leverages an Audio LLM to process raw audio input directly, enabling a robust understanding of user intent and critical events, such as falls or calls for help.
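The structural difference between an ASR-LLM cascade and a direct Audio LLM can be illustrated with a toy sketch. This is not DESAMO's implementation: every name here (`AudioClip`, `toy_asr`, `toy_text_llm`, `toy_audio_llm`) is a hypothetical stand-in, and the "models" are simple rules rather than neural networks. The only point it makes is that a transcript-only interface drops non-speech information before the language model ever sees it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioClip:
    """Toy stand-in for raw audio (hypothetical; real input would be a waveform)."""
    speech: Optional[str]       # words spoken in the clip, or None if no speech
    sound_event: Optional[str]  # notable non-speech sound such as "thud", or None


def toy_asr(clip: AudioClip) -> str:
    """Hypothetical ASR stub: returns only the transcript, so non-speech sound is lost."""
    return clip.speech or ""


def toy_text_llm(text: str) -> str:
    """Hypothetical text-only model: maps a transcript to an intent label."""
    lowered = text.lower()
    if "help" in lowered:
        return "alert:call_for_help"
    if "light" in lowered:
        return "command:lights"
    return "none"


def cascade_pipeline(clip: AudioClip) -> str:
    """ASR-LLM cascade: the language model sees the transcript and nothing else."""
    return toy_text_llm(toy_asr(clip))


def toy_audio_llm(clip: AudioClip) -> str:
    """Hypothetical audio model: has access to the whole clip, speech and non-speech."""
    if clip.sound_event == "thud":
        return "alert:possible_fall"
    return toy_text_llm(clip.speech or "")


def direct_pipeline(clip: AudioClip) -> str:
    """Direct audio path: no intermediate transcript bottleneck."""
    return toy_audio_llm(clip)


fall = AudioClip(speech=None, sound_event="thud")
request = AudioClip(speech="Turn on the light", sound_event=None)

print(cascade_pipeline(fall), direct_pipeline(fall))
print(cascade_pipeline(request), direct_pipeline(request))
```

In this sketch both paths agree on the spoken request, but only the direct path reacts to the speechless fall clip, because the cascade's transcript is empty.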
