DDS: A new device-degraded speech dataset for speech enhancement

Haoyu Li; Junichi Yamagishi

arXiv:2109.07931 · eess.AS · March 23, 2022

Open Access

TL;DR

The paper introduces DDS, a comprehensive speech dataset with aligned high-quality and degraded recordings across diverse environments and devices, to advance research in speech enhancement.

Contribution

The paper provides a large, diverse, and well-annotated dataset (approximately 2,000 hours of speech across 27 recording conditions) for training and evaluating speech enhancement systems, addressing the lack of real-world degraded speech data.

Findings

01 Recording diversity significantly impacts speech enhancement (SE) performance

02 Baseline systems show varied results across conditions

03 DDS enables realistic evaluation of speech enhancement methods

Abstract

A large and growing amount of speech content in real-life scenarios is being recorded on consumer-grade devices in uncontrolled environments, resulting in degraded speech quality. Transforming such low-quality device-degraded speech into high-quality speech is a goal of speech enhancement (SE). This paper introduces a new speech dataset, DDS, to facilitate research on SE. DDS provides aligned parallel recordings of high-quality speech (recorded in professional studios) and a number of versions of low-quality speech, producing approximately 2,000 hours of speech data. The DDS dataset covers 27 realistic recording conditions by combining diverse acoustic environments and microphone devices, and each version of a condition consists of multiple recordings from six microphone positions to simulate different noise and reverberation levels. We also test several SE baseline systems on the DDS…

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing

Methods: Test