Haha-Pod: An Attempt for Laughter-based Non-Verbal Speaker Verification

Yuke Lin; Xiaoyi Qin; Ning Jiang; Guoqing Zhao; Ming Li

arXiv:2309.14109 · eess.AS · October 10, 2023

TL;DR

This paper investigates laughter as a non-verbal cue for speaker verification, introducing a new dataset and a two-stage teacher-student framework that improves verification accuracy on laughter signals.

Contribution

The paper presents the Haha-Pod dataset of laughter clips paired with verbal speech, and proposes a Two-Stage Teacher-Student (2S-TS) framework for laughter-based speaker verification.

Findings

01. Significant improvement in laughter-based speaker verification accuracy.

02. Minor performance degradation on traditional verbal speech verification.

03. Introduction of a new laughter dataset for speaker recognition research.

Abstract

It is widely acknowledged that discriminative representations for speaker verification can be extracted from verbal speech. However, how much speaker information non-verbal vocalization carries remains an open question. This paper explores speaker verification based on the most ubiquitous form of non-verbal voice, laughter. First, we use a semi-automatic pipeline to collect a new dataset, Haha-Pod, from open-source podcast media. The dataset contains laughter clips from over 240 speakers, with corresponding high-quality verbal speech. Second, we propose a Two-Stage Teacher-Student (2S-TS) framework to minimize the within-speaker embedding distance between verbal and non-verbal (laughter) signals. Using Haha-Pod as a test set, two trials (S2L-Eval) are designed to verify the speaker's identity through laugh sounds. Experimental results demonstrate that our method can significantly improve…
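The abstract's two technical ideas, pulling a speaker's laughter embedding toward their speech embedding and then scoring verification trials on laughter, can be illustrated with a small sketch. This is not the paper's implementation: the choice of cosine distance as the alignment objective, the fixed decision threshold, and the names `within_speaker_distance` and `score_trials` are all assumptions made for illustration, and the toy 2-D vectors stand in for real speaker embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def within_speaker_distance(speech_emb, laugh_emb):
    """Hypothetical alignment objective: 1 - cosine similarity between a
    speaker's speech embedding (teacher side) and the same speaker's
    laughter embedding (student side). Assumed here, not taken from the paper."""
    return 1.0 - cosine_similarity(speech_emb, laugh_emb)


def score_trials(trials, threshold):
    """Score verification trials by cosine similarity.

    trials: list of (enroll_emb, test_emb, is_target) tuples.
    Returns (false_accept_rate, false_reject_rate) at the given threshold.
    """
    false_accepts = false_rejects = targets = nontargets = 0
    for enroll_emb, test_emb, is_target in trials:
        accepted = cosine_similarity(enroll_emb, test_emb) >= threshold
        if is_target:
            targets += 1
            false_rejects += 0 if accepted else 1
        else:
            nontargets += 1
            false_accepts += 1 if accepted else 0
    far = false_accepts / nontargets if nontargets else 0.0
    frr = false_rejects / targets if targets else 0.0
    return far, frr


# Toy embeddings (made up for illustration only).
speech_a = [1.0, 0.0]   # speaker A, verbal speech
laugh_a = [0.8, 0.6]    # speaker A, laughter
laugh_b = [0.0, 1.0]    # speaker B, laughter

trials = [
    (speech_a, laugh_a, True),    # same speaker: should be accepted
    (speech_a, laugh_b, False),   # different speaker: should be rejected
]
distance = within_speaker_distance(speech_a, laugh_a)
far, frr = score_trials(trials, threshold=0.5)
```

In this toy setup, a smaller `within_speaker_distance` means the laughter embedding sits closer to the speech embedding of the same speaker, which is the property the abstract says the 2S-TS framework targets. A real evaluation would sweep the threshold (for example to report an equal error rate) rather than fix it.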

Code & Models

Repositories

nevermorelin/hahapod (Official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing