I Know You're Listening: Adaptive Voice for HRI

Paige Tuttösí

arXiv:2506.15107 · cs.RO · July 31, 2025

TL;DR

This paper introduces a voice synthesis system for language teaching robots that is expressive, adapts to its environment, and offers a clarity mode that improves intelligibility for L2 (second-language) learners.

Contribution

It presents a lightweight expressive TTS system, environmental adaptation techniques, and an L2 clarity mode tailored for language teaching robots, addressing key gaps in task-specific robot voices.

Findings

01. The expressive voice is more socially appropriate and suitable for storytelling.

02. Environmental adjustments improve perceived appropriateness and awareness.

03. The L2 clarity mode reduces transcription errors and improves intelligibility.

Abstract

While the use of social robots for language teaching has been explored, there remains limited work on task-specific synthesized voices for language teaching robots. Given that language is a verbal task, this gap may have severe consequences for the effectiveness of robots for language teaching tasks. We address this lack of L2 teaching robot voices through three contributions: 1. We address the need for a lightweight and expressive robot voice. Using a fine-tuned version of Matcha-TTS, we use emoji prompting to create an expressive voice that shows a range of expressivity over time. The voice can run in real time with limited compute resources. Through case studies, we found this voice more expressive, socially appropriate, and suitable for long periods of expressive speech, such as storytelling. 2. We explore how to adapt a robot's voice to physical and social ambient environments to…

Taxonomy

Topics: Speech and dialogue systems