TS-RIR: Translated synthetic room impulse responses for speech augmentation

Anton Ratnarajah; Zhenyu Tang; Dinesh Manocha

arXiv:2103.16804·cs.SD·November 15, 2021

TL;DR

This paper introduces TS-RIRGAN, a novel method to enhance synthetic room impulse responses by translating them into more realistic versions, significantly improving far-field speech recognition accuracy.

Contribution

The paper presents TS-RIRGAN, a new architecture that translates synthetic RIRs into more realistic ones, bridging the fidelity gap for speech augmentation.

Findings

01

Improved synthetic RIRs reduce the word error rate (WER) by up to 19.9%.

02

Translation with TS-RIRGAN enhances RIR realism.

03

Method benefits far-field speech recognition performance.
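
For readers unfamiliar with the metric behind the first finding, the following is a minimal sketch of the standard word error rate: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This page does not say which scoring tool the authors used, so the function name `word_error_rate` and the example sentences are illustrative assumptions, not the paper's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Illustrative helper only; not the scoring pipeline used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("sat" -> "sit") and one deletion ("the").
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

A lower WER for a recognizer trained on augmented far-field speech is what the reported reduction refers to.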

Abstract

We present a method for improving the quality of synthetic room impulse responses for far-field speech recognition. We bridge the gap between the fidelity of synthetic room impulse responses (RIRs) and real RIRs using our novel TS-RIRGAN architecture. Given a synthetic RIR in the form of raw audio, we use TS-RIRGAN to translate it into a real RIR. We also perform real-world sub-band room equalization on the translated synthetic RIR. Our overall approach improves the quality of synthetic RIRs by compensating for low-frequency wave effects, similar to those in real RIRs. We evaluate the performance of improved synthetic RIRs on a far-field speech dataset augmented by convolving the LibriSpeech clean speech dataset [1] with RIRs and adding background noise. We show that far-field speech augmented using our improved synthetic RIRs reduces the word error rate by up to…
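
The augmentation step described in the abstract (convolving clean speech with an RIR, then adding background noise) can be sketched as follows. This is a minimal illustration using NumPy, not the authors' pipeline: the function name `augment_far_field`, the `snr_db` parameter, and the random toy signals are all assumptions made here, and the abstract does not state what noise levels were used.

```python
import numpy as np


def augment_far_field(speech, rir, noise, snr_db):
    """Simulate far-field speech: reverberate with an RIR, then add noise.

    Hypothetical helper for illustration; snr_db is the target
    speech-to-noise ratio in decibels (an assumed parameter).
    """
    # Linear convolution with the RIR, truncated to the input length.
    reverberant = np.convolve(speech, rir)[: len(speech)]

    # Repeat or trim the noise so it covers the whole utterance.
    repeats = int(np.ceil(len(reverberant) / len(noise)))
    noise = np.tile(noise, repeats)[: len(reverberant)]

    # Scale the noise so that
    # 10 * log10(speech_power / scaled_noise_power) == snr_db.
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)  # assumed non-zero
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return reverberant + scale * noise


# Toy stand-ins for real data: white-noise "speech", an exponentially
# decaying random "RIR", and white background noise.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
rir = np.exp(-np.arange(800) / 100.0) * rng.standard_normal(800)
noise = rng.standard_normal(4000)
far_field = augment_far_field(speech, rir, noise, snr_db=20.0)
```

In practice the `rir` argument would be a measured RIR, a synthetic RIR, or a synthetic RIR improved by translation and equalization, which is the comparison the abstract describes.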

Code & Models

Repositories

GAMMA-UMD/TS-RIR (tf, Official)
