Emotional Speech Synthesis for Companion Robot to Imitate Professional Caregiver Speech
Takeshi Homma, Qinghua Sun, Takuya Fujioka, Ryuta Takawaki, Eriko Ankyu, Kenji Nagamatsu, Daichi Sugawara, Etsuko T. Harada

TL;DR
This paper presents a speech synthesis method enabling robots to imitate human emotional speech with less manual effort, aiming to influence elderly people's circadian rhythms through emotionally appropriate communication.
Contribution
It introduces an emotion-driven speech synthesis approach using automatic emotion recognition, reducing manual adjustments compared to previous methods.
Findings
Participants felt more active when hearing early morning speech
Participants felt calmer when hearing late-night speech
The method successfully influenced emotional states in elderly listeners
Abstract
When people try to influence others to do something, they subconsciously adjust their speech to include appropriate emotional information. For a robot to influence people in the same way, it should be able to imitate the range of human emotions when speaking. To achieve this, we propose a speech synthesis method for imitating the emotional states in human speech. In contrast to previous methods, our method requires less manual effort to adjust the emotion of the synthesized speech. Our synthesizer receives an emotion vector that characterizes the emotion of the synthesized speech. The vector is obtained automatically from human utterances by a speech emotion recognizer. We evaluated our method in a scenario where a robot tries to regulate an elderly person's circadian rhythm by speaking to the person using appropriate emotional states. For the…
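The data flow described in the abstract (human utterances, then a speech emotion recognizer, then an emotion vector, then the synthesizer) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the emotion labels, the two-value (pitch, energy) feature pair, the hand-written scoring rule, and the function names `recognize_emotion`, `mean_emotion_vector`, and `synthesize` are all hypothetical. Real systems would use trained models for both the recognizer and the synthesizer.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical emotion categories; the paper's actual label set is not given here.
EMOTIONS = ("neutral", "active", "calm")


def recognize_emotion(features: Tuple[float, float]) -> List[float]:
    """Toy stand-in for a speech emotion recognizer.

    `features` is an assumed (pitch, energy) pair, each scaled to [0, 1].
    Returns a softmax distribution over EMOTIONS.
    """
    pitch, energy = features
    # Hand-written scores: high pitch and energy favor "active", low favor "calm".
    scores = [0.0, 2.0 * (pitch + energy - 1.0), 2.0 * (1.0 - pitch - energy)]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_emotion_vector(utterances: Sequence[Tuple[float, float]]) -> List[float]:
    """Average the recognizer output over a set of reference utterances."""
    if not utterances:
        raise ValueError("at least one reference utterance is required")
    vectors = [recognize_emotion(u) for u in utterances]
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(EMOTIONS))]


def synthesize(text: str, emotion_vector: Sequence[float]) -> Dict[str, object]:
    """Stub synthesizer: returns the conditioning a real model would receive."""
    return {"text": text, "emotion": dict(zip(EMOTIONS, emotion_vector))}


# Assumed features for two lively reference utterances (illustrative values only).
morning_refs = [(0.8, 0.9), (0.7, 0.8)]
morning_vector = mean_emotion_vector(morning_refs)
request = synthesize("Good morning!", morning_vector)
```

The point of the sketch is that no one sets the emotion parameters by hand: the vector passed to `synthesize` comes entirely from running the recognizer on reference speech, which is the source of the reduced manual effort the abstract claims.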
Taxonomy
Topics: Social Robot Interaction and HRI · Emotion and Mood Recognition · AI in Service Interactions
