ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization
Marco Gaudesi, Felix Weninger, Dushyant Sharma, Puming Zhan

TL;DR
This paper introduces ChannelAugment, a data augmentation method that randomly drops input channels during training. It makes multi-channel ASR more robust to array configurations that differ from the training array, and it reduces training time.
Contribution
The paper proposes ChannelAugment, a simple channel dropout technique for training multi-channel ASR systems, enhancing their adaptability to various array geometries and reducing training time.
Findings
10.6% word error rate (WER) improvement when testing across array configurations
74% reduction in training time for MVDR approach
Enhanced robustness to array variations
Abstract
End-to-end (E2E) multi-channel ASR systems show state-of-the-art performance in far-field ASR tasks by joint training of a multi-channel front-end along with the ASR model. The main limitation of such systems is that they are usually trained with data from a fixed array geometry, which can lead to degradation in accuracy when a different array is used in testing. This makes it challenging to deploy these systems in practice, as it is costly to retrain and deploy different models for various array configurations. To address this, we present a simple and effective data augmentation technique, which is based on randomly dropping channels in the multi-channel audio input during training, in order to improve the robustness to various array configurations at test time. We call this technique ChannelAugment, in contrast to SpecAugment (SA) which drops time and/or frequency components of a…
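The channel-dropping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper name `channel_augment`, the `min_keep` parameter, and the uniform choice of how many channels to keep are all assumptions made for illustration. The abstract also does not say whether dropped channels are removed or zeroed out; this sketch removes them, which assumes the front-end accepts a variable number of input channels.

```python
import random
from typing import List, Optional, Sequence

Waveform = List[float]  # one channel of audio samples


def channel_augment(
    channels: Sequence[Waveform],
    min_keep: int = 1,
    rng: Optional[random.Random] = None,
) -> List[Waveform]:
    """Return a random subset of the input channels, preserving their order.

    Hypothetical helper: `min_keep` and the uniform choice of how many
    channels to keep are assumptions, not details taken from the paper.
    """
    if not 1 <= min_keep <= len(channels):
        raise ValueError("min_keep must be between 1 and the number of channels")
    rng = rng or random.Random()
    # Pick how many channels survive, then which ones.
    n_keep = rng.randint(min_keep, len(channels))
    kept = sorted(rng.sample(range(len(channels)), n_keep))
    return [list(channels[i]) for i in kept]


# Usage: a toy 4-channel utterance with 3 samples per channel.
utterance = [[0.1, 0.2, 0.3], [0.0, 0.1, 0.0], [0.4, 0.4, 0.4], [0.9, 0.8, 0.7]]
augmented = channel_augment(utterance, min_keep=2, rng=random.Random(0))
print(len(augmented), "of", len(utterance), "channels kept")
```

Applied independently to each training utterance, a routine like this exposes the model to many effective array sizes even though the recordings come from a single fixed array.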
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Underwater Acoustics Research
