Filtered Noise Shaping for Time Domain Room Impulse Response Estimation From Reverberant Speech

Christian J. Steinmetz; Vamsi Krishna Ithapu; Paul Calamia

arXiv:2107.07503·eess.AS·July 16, 2021

TL;DR

This paper introduces FiNS, a novel deep learning model that directly estimates time domain room impulse responses from reverberant speech, enabling realistic acoustic matching for audio applications.

Contribution

FiNS is a domain-inspired network that models the RIR as a sum of decaying filtered noise signals, aiming for better efficiency and perceptual accuracy than existing methods.
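As a rough illustration of the "sum of filtered noise" idea, the sketch below builds a synthetic RIR from a few noise bands, each filtered and multiplied by an exponential decay, plus a unit impulse standing in for the direct sound mentioned in the abstract. It is not the FiNS decoder: the one-pole lowpass filters, the fixed decay times, and the names `one_pole_lowpass` and `synth_rir` are assumptions made for this example, whereas the network predicts its filters and envelopes from the input speech.

```python
import math
import random


def one_pole_lowpass(signal, alpha):
    """Filter with y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]."""
    out, prev = [], 0.0
    for sample in signal:
        prev = alpha * sample + (1.0 - alpha) * prev
        out.append(prev)
    return out


def synth_rir(n_samples, sample_rate, bands, direct_gain=1.0, seed=0):
    """Sum decaying filtered-noise bands, then add a direct-sound impulse.

    bands: iterable of (alpha, t60_seconds, gain) tuples, one per noise band.
    All three values are hand-picked here (hypothetical), not learned.
    """
    rng = random.Random(seed)
    rir = [0.0] * n_samples
    for alpha, t60, gain in bands:
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        shaped = one_pole_lowpass(noise, alpha)
        # Amplitude envelope exp(-rate * t) that falls by 60 dB at t = t60.
        rate = 3.0 * math.log(10.0) / t60
        for n, sample in enumerate(shaped):
            rir[n] += gain * sample * math.exp(-rate * n / sample_rate)
    if n_samples > 0:
        rir[0] += direct_gain  # direct sound as a single impulse
    return rir


# Two bands: a brighter, faster-decaying one and a darker, slower one.
rir = synth_rir(800, 8000, bands=[(0.9, 0.3, 1.0), (0.2, 0.6, 0.5)])
```

Giving each band its own decay time lets the synthetic response decay at different rates in different frequency regions, which a single broadband envelope cannot do.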

Findings

01 Accurately estimates RIR parameters such as T60 and DRR

02 Synthesizes perceptually realistic room acoustics

03 Outperforms deep learning baselines in listening tests
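For readers unfamiliar with the two parameters named in the findings, the following sketch shows one common way to measure them from an impulse response: T60 from a line fit to the Schroeder energy decay curve, and DRR as the energy ratio between a short window around the direct-path peak and everything after it. The fit range (-5 to -25 dB), the 2.5 ms window, and all function names are assumptions for this example; the paper's own evaluation procedure may differ.

```python
import math


def energy_decay_curve_db(rir):
    """Schroeder backward integration, normalized to 0 dB at the first sample."""
    total = 0.0
    edc = [0.0] * len(rir)
    for n in range(len(rir) - 1, -1, -1):
        total += rir[n] ** 2
        edc[n] = total
    # Trailing zero-energy samples are dropped to avoid log10(0).
    return [10.0 * math.log10(e / edc[0]) for e in edc if e > 0.0]


def estimate_t60(rir, sample_rate, start_db=-5.0, end_db=-25.0):
    """Fit a line to the decay curve between start_db and end_db, extrapolate to -60 dB."""
    curve = energy_decay_curve_db(rir)
    pts = [(n / sample_rate, level) for n, level in enumerate(curve)
           if end_db <= level <= start_db]
    if len(pts) < 2:
        raise ValueError("not enough decay range to fit a slope")
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_l = sum(l for _, l in pts) / len(pts)
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))  # dB per second
    return -60.0 / slope


def estimate_drr_db(rir, sample_rate, window_ms=2.5):
    """Direct-to-reverberant ratio in dB around the largest-magnitude sample."""
    peak = max(range(len(rir)), key=lambda n: abs(rir[n]))
    half = round(window_ms * sample_rate / 1000.0)
    lo, hi = max(0, peak - half), peak + half + 1
    direct = sum(s * s for s in rir[lo:hi])
    reverberant = sum(s * s for s in rir[hi:])
    if reverberant == 0.0:
        raise ValueError("no reverberant energy after the direct window")
    return 10.0 * math.log10(direct / reverberant)


# Noise-free exponential decay designed to have T60 = 0.5 s at 8 kHz.
fs = 8000
rate = 3.0 * math.log(10.0) / 0.5
toy_rir = [math.exp(-rate * n / fs) for n in range(fs)]
t60 = estimate_t60(toy_rir, fs)
```

On the toy decay above the estimate should land very close to 0.5 s, since the decay curve of a pure exponential is a straight line in dB.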

Abstract

Deep learning approaches have emerged that aim to transform an audio signal so that it sounds as if it was recorded in the same room as a reference recording, with applications both in audio post-production and augmented reality. In this work, we propose FiNS, a Filtered Noise Shaping network that directly estimates the time domain room impulse response (RIR) from reverberant speech. Our domain-inspired architecture features a time domain encoder and a filtered noise shaping decoder that models the RIR as a summation of decaying filtered noise signals, along with direct sound and early reflection components. Previous methods for acoustic matching utilize either large models to transform audio to match the target room or predict parameters for algorithmic reverberators. Instead, blind estimation of the RIR enables efficient and realistic transformation with a single convolution. An…
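The "single convolution" step in the abstract can be pictured as follows: once an RIR has been estimated, dry audio is made to sound as if it were recorded in the target room by convolving it with that RIR. This is a minimal sketch using plain Python lists; the function names and the peak rescaling are assumptions for the example and are not taken from the paper.

```python
def convolve(signal, rir):
    """Full linear convolution of two sequences of floats."""
    if not signal or not rir:
        return []
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


def apply_rir(dry, rir, peak=1.0):
    """Convolve dry audio with an RIR; rescale only if the result exceeds `peak`."""
    wet = convolve(dry, rir)
    largest = max((abs(v) for v in wet), default=0.0)
    if largest > peak:
        wet = [v * peak / largest for v in wet]
    return wet


# A toy RIR: direct sound followed by two weaker reflections.
toy_rir = [1.0, 0.0, 0.5, 0.25]
wet = apply_rir([0.2, -0.1, 0.05], toy_rir)
```

The direct double loop is written for clarity. For real audio lengths an FFT-based routine such as `scipy.signal.fftconvolve` would be the usual choice.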

Code & Models

Repositories

kyungyunlee/fins
pytorch

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Advanced Adaptive Filtering Techniques