Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study
Keyu An, Ji Xiao, Zhijian Ou

TL;DR
This paper compares three methods for leveraging single-channel speech data to improve multi-channel end-to-end speech recognition, finding that data simulation yields the best performance but requires longer training.
Contribution
It provides a systematic comparison of back-end pre-training, data scheduling, and data simulation for multi-channel ASR using single-channel data, highlighting the effectiveness of data simulation.
Findings
Data simulation outperforms other methods in accuracy.
Data scheduling slightly outperforms back-end pre-training.
All methods improve multi-channel ASR performance.
Abstract
Recently, the end-to-end training approach for multi-channel ASR, which usually consists of a beamforming front-end and a recognition back-end, has shown its effectiveness. However, end-to-end training becomes more difficult due to the integration of multiple modules, particularly because multi-channel speech data recorded in real environments are limited in size. This raises the demand to exploit single-channel data for multi-channel end-to-end ASR. In this paper, we systematically compare the performance of three schemes for exploiting external single-channel data for multi-channel end-to-end ASR, namely back-end pre-training, data scheduling, and data simulation, under different settings such as the size of the single-channel data and the choice of front-end. Extensive experiments on the CHiME-4 and AISHELL-4 datasets demonstrate that while all three methods improve the…
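The abstract names data simulation but does not say how the multi-channel signals are generated. One common way to do it, shown here only as a minimal sketch and not as the authors' procedure, is to convolve each single-channel utterance with one room impulse response (RIR) per microphone and optionally add noise at a chosen signal-to-noise ratio. The function name `simulate_multichannel`, its parameters, and the toy RIRs are all hypothetical; a real pipeline would draw RIRs from a room simulator or a measured set.

```python
import numpy as np


def simulate_multichannel(speech, rirs, noise=None, snr_db=None):
    """Turn single-channel speech into a simulated multi-channel signal.

    speech: 1-D array of shape (T,)
    rirs:   2-D array of shape (C, L), one impulse response per microphone
    noise:  optional 2-D array of shape (C, >=T)
    snr_db: target SNR in dB, used only when noise is given
    Returns an array of shape (C, T).
    """
    speech = np.asarray(speech, dtype=np.float64)
    rirs = np.asarray(rirs, dtype=np.float64)
    num_samples = speech.shape[0]

    # Convolve the utterance with each channel's RIR, trimmed to length T.
    channels = np.stack(
        [np.convolve(speech, rir)[:num_samples] for rir in rirs]
    )

    if noise is not None and snr_db is not None:
        noise = np.asarray(noise, dtype=np.float64)[:, :num_samples]
        speech_power = np.mean(channels ** 2)
        noise_power = np.mean(noise ** 2)
        # Scale the noise so that speech_power / scaled_noise_power
        # matches the requested SNR.
        scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
        channels = channels + scale * noise

    return channels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    utterance = rng.standard_normal(16000)      # stand-in for 1 s of speech
    toy_rirs = np.zeros((2, 4))
    toy_rirs[0, 0] = 1.0                        # channel 0: direct path
    toy_rirs[1, 2] = 0.5                        # channel 1: delayed, attenuated
    background = rng.standard_normal((2, 16000))
    simulated = simulate_multichannel(utterance, toy_rirs, background, snr_db=10.0)
    print(simulated.shape)
```

The simulated array can then be fed to a beamforming front-end in place of real multi-channel recordings, which is the sense in which single-channel data becomes usable for multi-channel training.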
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Acoustic Wave Resonator Technologies
