Personalized Socially Assistive Robots With End-to-End Speech-Language Models For Well-Being Support

Mengxue Fu, Zhonghao Shi, Minyu Huang, Siqi Liu, Mina Kian, Yirui Song, and Maja J. Matarić

arXiv:2507.14412 · cs.RO · July 22, 2025

TL;DR

This study explores integrating end-to-end speech-language models (SLMs) into socially assistive robots (SARs) to provide real-time, personalized, and empathetic well-being support, and evaluates the approach in a small user study (N = 11).

Contribution

The paper introduces an SLM-enabled SAR dialogue system, evaluates its usability, and identifies remaining limitations in real-time interaction and personalization to guide future improvements.

Findings

01

Participants perceived the robot as empathetic and natural in dialogue.

02

Participants reported that the robot's nonverbal behaviors lacked variability and synchronization.

03

Participants found the robot's verbal responses generic and repetitive.

Abstract

Socially assistive robots (SARs) have shown great potential for supplementing well-being support. However, prior studies have found that existing dialogue pipelines for SARs remain limited in real-time latency, back-channeling, and personalized speech dialogue. Toward addressing these limitations, we propose using integrated end-to-end speech-language models (SLMs) with SARs. This work 1) evaluated the usability of an SLM-enabled SAR dialogue system through a small user study, and 2) identified remaining limitations through study user feedback to inform future improvements. We conducted a small within-participant user study with university students (N = 11) whose results showed that participants perceived an SLM-enabled SAR system as capable of providing empathetic feedback, natural turn-taking, back-channeling, and adaptive responses. We also found that participants reported the…
