Laugh Betrays You? Learning Robust Speaker Representation From Speech Containing Non-Verbal Fragments

Yuke Lin; Xiaoyi Qin; Huahua Cui; Zhenyi Zhu; Ming Li

arXiv:2210.16028 · eess.AS · November 21, 2023

TL;DR

This paper investigates speaker verification on utterances that contain non-verbal laughter segments and proposes a Laughter-Splicing based Network that improves performance on trials involving laughter.

Contribution

The paper introduces a Laughter-Splicing based Network for speaker verification on utterances containing laughter, improving robustness when the input includes non-verbal segments.

Findings

1. 20% relative improvement on Laughter-Laughter (LL) trials
2. 22% relative improvement on Speech-Laughter (SL) trials
3. Maintains performance on neutral speech datasets
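
As a reading aid for the percentages above, here is a minimal sketch of how a relative improvement is usually computed for an error-type metric, where lower is better. The summary does not name the metric; equal error rate (EER) is assumed here because it is common in speaker verification. The function name and the numbers in the example are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline: float, proposed: float) -> float:
    """Relative reduction of an error metric (lower is better).

    Assumption: the metric is an error rate such as EER, so an
    improvement means the proposed value is smaller than the baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline


# Hypothetical numbers for illustration only (not taken from the paper):
# an error rate that drops from 10.0% to 8.0% is a 20% relative improvement.
example = relative_improvement(10.0, 8.0)
print(f"{example:.0%}")
```

A relative figure depends on the baseline: the same 20% corresponds to a different absolute change for each starting error rate.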

Abstract

The success of automatic speaker verification shows that discriminative speaker representations can be extracted from neutral speech. Intuitively, however, laughter, as a kind of non-verbal vocalization, should also carry speaker information. This paper therefore explores speaker verification on utterances containing non-verbal laughter segments. We collect a set of clips with laughter components by running a laughter detection script on VoxCeleb and part of the CN-Celeb dataset. To further filter out untrusted clips, probability scores are calculated by our binary laughter detection classifier, which is pre-trained on pure laughter and neutral speech. Then, from the clips whose scores exceed the threshold, we construct trials under two different evaluation scenarios: Laughter-Laughter (LL) and Speech-Laughter (SL). Then a novel method called Laughter-Splicing based Network…
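
The filtering and trial-construction steps described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the clip dictionaries, the field names (`id`, `speaker`, `laugh_prob`), the 0.5 threshold, and the exhaustive pairing strategy are all hypothetical, since the abstract does not give these details.

```python
from itertools import combinations, product


def filter_clips(clips, threshold=0.5):
    """Keep clips whose laughter probability exceeds the threshold.

    `laugh_prob` stands in for the score from a binary laughter
    detection classifier (hypothetical field name and threshold).
    """
    return [c for c in clips if c["laugh_prob"] > threshold]


def build_trials(laugh_clips, speech_clips):
    """Build (enroll_id, test_id, is_target) trials for two scenarios.

    LL pairs two laughter clips; SL pairs a neutral-speech clip with
    a laughter clip. A trial is a target trial when both clips come
    from the same speaker.
    """
    ll = [(a["id"], b["id"], a["speaker"] == b["speaker"])
          for a, b in combinations(laugh_clips, 2)]
    sl = [(s["id"], l["id"], s["speaker"] == l["speaker"])
          for s, l in product(speech_clips, laugh_clips)]
    return ll, sl


# Toy data for illustration only.
candidates = [
    {"id": "a1", "speaker": "A", "laugh_prob": 0.92},
    {"id": "a2", "speaker": "A", "laugh_prob": 0.81},
    {"id": "b1", "speaker": "B", "laugh_prob": 0.88},
    {"id": "b2", "speaker": "B", "laugh_prob": 0.30},  # filtered out
]
speech = [
    {"id": "a0", "speaker": "A"},
    {"id": "b0", "speaker": "B"},
]

trusted = filter_clips(candidates, threshold=0.5)
ll_trials, sl_trials = build_trials(trusted, speech)
print(len(trusted), len(ll_trials), len(sl_trials))
```

In practice a trial list would likely be sampled rather than built exhaustively, to keep the ratio of target to non-target pairs under control; the exhaustive pairing here only keeps the sketch short.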

Taxonomy

Topics: Humor Studies and Applications · Sentiment Analysis and Opinion Mining · Speech Recognition and Synthesis

Methods: Test