Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning

Noé Tits, Kevin El Haddad, Thierry Dutoit

arXiv:2008.09483·eess.AS·August 24, 2020

TL;DR

This paper introduces a laughter synthesis system built on a sequence-to-sequence TTS model and transfer learning. In a listening test it reaches higher perceived naturalness than an HMM-based laughter synthesis baseline, and it is a first step toward speech synthesis with a controllable amusement level.

Contribution

It integrates laughter synthesis into a seq2seq TTS system through transfer learning, training a single deep learning model to generate both speech and laughs from annotations.

Findings

01. The proposed model outperforms HMM-based laughter synthesis in perceived naturalness.

02. Transfer learning effectively enables joint speech and laughter generation.

03. The system is a step toward emotionally expressive speech synthesis with laughter control.

Abstract

Despite the growing interest in expressive speech synthesis, synthesis of nonverbal expressions is an under-explored area. In this paper we propose an audio laughter synthesis system based on a sequence-to-sequence TTS synthesis system. We leverage transfer learning by training a deep learning model to generate both speech and laughs from annotations. We evaluate our model with a listening test, comparing its performance to that of an HMM-based laughter synthesis system, and find that it reaches higher perceived naturalness. Our solution is a first step towards a TTS system able to synthesize speech with control over the amusement level through laughter integration.

Code & Models

Repositories

numediart/LaughterSynthesis (Official)