Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition

Yuan Gong; Jin Yu; James Glass

arXiv:2205.03433·cs.SD·June 22, 2022

TL;DR

This paper introduces VocalSound, a large and diverse dataset of human vocal sounds with rich metadata, significantly enhancing recognition accuracy and supporting research in health and speech applications.

Contribution

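VocalSound, a dataset of over 21,000 crowdsourced recordings of laughter, sighs, coughs, throat clearing, sneezes, and sniffs from 3,365 unique subjects, with detailed metadata. It addresses the small sample counts and noisy labels that limit previous datasets for vocal sound recognition.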

Findings

01

Adding VocalSound to an existing dataset as training material improves a model's vocal sound recognition performance by 41.9%.

02

The dataset's metadata enables demographic and health-related analysis.

03

VocalSound enhances robustness of vocal sound classification models.

Abstract

Recognizing human non-speech vocalizations is an important task and has broad applications such as automatic sound transcription and health condition monitoring. However, existing datasets have a relatively small number of vocal sound samples or noisy labels. As a consequence, state-of-the-art audio event classification models may not perform well in detecting human vocal sounds. To support research on building robust and accurate vocal sound recognition, we have created the VocalSound dataset, consisting of over 21,000 crowdsourced recordings of laughter, sighs, coughs, throat clearing, sneezes, and sniffs from 3,365 unique subjects. Experiments show that the vocal sound recognition performance of a model can be significantly improved by 41.9% by adding the VocalSound dataset to an existing dataset as training material. In addition, different from previous datasets, the VocalSound dataset…
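
The six classes named in the abstract map naturally onto a small label index for a classifier. The following is a minimal sketch of that mapping only; the string identifiers, their ordering, and the `one_hot` helper are illustrative assumptions and are not taken from the paper or its repository.

```python
# Six vocal sound classes named in the abstract. The string identifiers
# and their ordering are illustrative assumptions, not the dataset's own.
CLASSES = ["laughter", "sigh", "cough", "throat_clearing", "sneeze", "sniff"]

# Map each class name to an integer index for use as a training target.
LABEL_TO_INDEX = {name: i for i, name in enumerate(CLASSES)}


def one_hot(label: str) -> list[int]:
    """Return a one-hot target vector for a class name (hypothetical helper)."""
    index = LABEL_TO_INDEX[label]  # raises KeyError for unknown labels
    return [1 if i == index else 0 for i in range(len(CLASSES))]


if __name__ == "__main__":
    print(one_hot("cough"))
```

A fixed, explicit class list keeps the index assignment stable across runs, which matters when predictions are compared between models trained on different data mixes.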

Code & Models

Repositories

YuanGongND/vocalsound
PyTorch · Official
