Neural Network-based Virtual Microphone Estimator

Tsubasa Ochiai; Marc Delcroix; Tomohiro Nakatani; Rintaro Ikeshita; Keisuke Kinoshita; Shoko Araki

arXiv:2101.04315 · eess.AS · January 13, 2021 · 1 citation

Open Access

TL;DR

This paper introduces a neural network-based method for estimating virtual microphone signals directly from real multi-channel recordings, improving speech enhancement and recognition without relying on physical model assumptions.

Contribution

It proposes a fully supervised neural network approach for virtual microphone estimation that works directly on real recordings, bypassing traditional physical model limitations.

Findings

01 High estimation accuracy on real recordings

02 Improved speech enhancement performance

03 Enhanced speech recognition results

Abstract

Developing microphone array technologies for a small number of microphones is important due to the constraints of many devices. One direction to address this situation consists of virtually augmenting the number of microphone signals, e.g., based on several physical model assumptions. However, such assumptions are not necessarily met in realistic conditions. In this paper, as an alternative approach, we propose a neural network-based virtual microphone estimator (NN-VME). The NN-VME estimates virtual microphone signals directly in the time domain, by utilizing the precise estimation capability of the recent time-domain neural networks. We adopt a fully supervised learning framework that uses actual observations at the locations of the virtual microphones at training time. Consequently, the NN-VME can be trained using only multi-channel observations and thus directly on real recordings,…
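The supervised setup described in the abstract (a real recording at the virtual microphone's position serves as the training target) can be illustrated with a small sketch. This is not the authors' NN-VME: as a loudly stated assumption, a multi-channel linear FIR filter fit by least squares stands in for the time-domain neural network, and the three-microphone data is synthetic. The names `make_frames`, `fit_virtual_mic`, `estimate_virtual_mic`, and `delay` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def make_frames(x, taps):
    """Stack `taps` consecutive samples of every channel into one row per time step.

    x has shape (channels, samples); the result has shape
    (samples - taps + 1, channels * taps).
    """
    channels, samples = x.shape
    rows = samples - taps + 1
    cols = [x[c, k:k + rows] for c in range(channels) for k in range(taps)]
    return np.stack(cols, axis=1)

def fit_virtual_mic(real_mics, target_mic, taps=8):
    """Fit a linear stand-in estimator (ASSUMPTION: not a neural network).

    The target is the actual recording at the virtual microphone position,
    mirroring the fully supervised framework in the abstract.
    """
    X = make_frames(real_mics, taps)
    y = target_mic[taps - 1:]  # align each row with its most recent input sample
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def estimate_virtual_mic(real_mics, weights, taps=8):
    """Predict the virtual microphone signal from the real channels only."""
    return make_frames(real_mics, taps) @ weights

def delay(signal, d):
    """Delay a 1-D signal by d samples, padding the start with zeros."""
    return np.concatenate([np.zeros(d), signal[:len(signal) - d]])

# Synthetic three-microphone recording (ASSUMPTION: toy delays and gains).
rng = np.random.default_rng(0)
source = rng.standard_normal(4000)
mics = np.stack([
    1.0 * delay(source, 0),
    0.8 * delay(source, 2),
    0.6 * delay(source, 4),
]) + 0.01 * rng.standard_normal((3, 4000))

# Microphone 1 is held out: used as the target in training, as reference in testing.
train, test = slice(0, 3000), slice(3000, 4000)
taps = 8
weights = fit_virtual_mic(mics[[0, 2], train], mics[1, train], taps)
estimate = estimate_virtual_mic(mics[[0, 2], test], weights, taps)
reference = mics[1, test][taps - 1:]
rel_err = np.linalg.norm(estimate - reference) / np.linalg.norm(reference)
print(f"relative error on held-out data: {rel_err:.3f}")
```

The point of the sketch is the data flow rather than the model: training needs nothing beyond multi-channel observations, because one real channel plays the role of the virtual microphone. Once fitted, the estimator runs on the remaining channels alone.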

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Advanced Adaptive Filtering Techniques