Monaural Multi-Talker Speech Recognition using Factorial Speech Processing Models
Mahdi Khademian, Mohammad Mehdi Homayounpour

TL;DR
This paper introduces a factorial speech processing model with joint decoding and neural network enhancements for monaural multi-talker speech recognition, surpassing previous super-human performance levels.
Contribution
It develops a novel joint-token passing algorithm for simultaneous decoding of target and masker speakers, improving over traditional two-phase methods.
Findings
Outperforms previous super-human speech recognition systems.
Achieves 5.5% absolute performance improvement over initial super-human models.
Attains 2.7% absolute improvement over recent deep learning-based competitors.
Abstract
A Pascal challenge entitled monaural multi-talker speech recognition was developed, targeting the problem of robust automatic speech recognition against speech-like noises, which significantly degrade the performance of automatic speech recognition systems. In this challenge, two competing speakers say a simple command simultaneously, and the objective is to recognize the speech of the target speaker. Surprisingly, during the challenge a team from IBM Research achieved performance better than that of human listeners on this task. The method proposed by the IBM team consists of an intermediate speech separation step followed by single-talker speech recognition. This paper reconsiders the task of this challenge based on gain-adapted factorial speech processing models. It develops a joint-token passing algorithm for direct utterance decoding of both the target and masker speakers simultaneously.…
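To make the idea of decoding both speakers at once more concrete, here is a minimal sketch of joint Viterbi decoding over the product state space of two toy HMMs. It is not the paper's joint-token passing algorithm, which is not specified in the text above. The function name `joint_viterbi`, the scalar observations, the fixed variance, and the max-based observation model (the louder source dominates each frame) are all assumptions made for illustration, and gain adaptation is left out entirely.

```python
import math
from itertools import product


def joint_viterbi(obs, log_init_a, log_trans_a, means_a,
                  log_init_b, log_trans_b, means_b, var=1.0):
    """Find the best pair of state paths for two chains given one mixed signal.

    Hypothetical toy model: each state has a scalar mean, and the mixed
    observation is Gaussian around max(mean_a, mean_b).
    """
    pairs = list(product(range(len(means_a)), range(len(means_b))))

    def log_obs(y, i, j):
        # Assumed max approximation: the louder source sets the observed value.
        mu = max(means_a[i], means_b[j])
        return -0.5 * ((y - mu) ** 2 / var + math.log(2 * math.pi * var))

    def step(prev_scores, p, i, j):
        # Transitions factorize: each speaker moves independently.
        return prev_scores[p] + log_trans_a[p[0]][i] + log_trans_b[p[1]][j]

    # score[(i, j)] = best log-probability of a joint path ending in (i, j).
    score = {(i, j): log_init_a[i] + log_init_b[j] + log_obs(obs[0], i, j)
             for i, j in pairs}
    backpointers = []
    for y in obs[1:]:
        new_score, bp = {}, {}
        for i, j in pairs:
            best_prev = max(pairs, key=lambda p: step(score, p, i, j))
            new_score[(i, j)] = step(score, best_prev, i, j) + log_obs(y, i, j)
            bp[(i, j)] = best_prev
        score = new_score
        backpointers.append(bp)

    # Trace back from the best final joint state.
    last = max(pairs, key=lambda p: score[p])
    path = [last]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    path.reverse()
    return [p[0] for p in path], [p[1] for p in path], score[last]


# Toy usage: two 2-state speakers ("quiet" and "loud") with sticky transitions.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
path_a, path_b, best_log_prob = joint_viterbi(
    [5.0, 5.0, 3.0, 3.0, 0.0, 0.0],
    log_init, log_trans, [0.0, 5.0],
    log_init, log_trans, [0.0, 3.0],
)
print(path_a, path_b)
```

The search runs over every pair of states, so its cost grows with the product of the two state counts. A two-phase approach (separate first, then recognize) avoids that joint search, which is the trade-off the abstract contrasts.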
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Advanced Data Compression Techniques
