Complexity boosted adaptive training for better low resource ASR performance
Hongxuan Lu, Shenjian Wang, Biao Li

TL;DR
This paper introduces a two-stage adaptive training method for low-resource ASR that dynamically adjusts data augmentation and loss based on sample complexity, significantly improving recognition accuracy.
Contribution
It proposes the novel MinMax-IBF adaptive policy for dynamic training adjustments, enhancing model performance over traditional static methods.
Findings
Up to 13.4% relative WER reduction on LibriSpeech 100h test sets.
Up to 6.3% relative WER reduction on AISHELL-1.
Demonstrates the effectiveness of complexity-aware adaptive training.
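The figures above are relative, not absolute, reductions. As a reminder of how that metric is usually computed, here is a minimal sketch; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in the same unit (e.g. percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers for illustration only, not results from the paper:
# a baseline WER of 10.0 and a new WER of 9.0.
example = relative_wer_reduction(baseline_wer=10.0, new_wer=9.0)
```

A relative reduction therefore depends on the baseline: the same absolute gain counts for more when the baseline WER is already low.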
Abstract
Throughout the training of an ASR model, the intensity of data augmentation and the method of computing the training loss are typically fixed by preset parameters. For example, SpecAugment uses a predefined augmentation strength to mask parts of the time-frequency spectrum. Similarly, in CTC-based multi-layer models, the loss is generally computed from the output of the encoder's final layer. However, ignoring the dynamic characteristics of the training samples may leave models suboptimally trained. To address this issue, we present a two-stage training method called complexity-boosted adaptive (CBA) training, which dynamically adjusts the data augmentation strategy and CTC loss propagation according to the complexity of the training samples. In the first stage, we train the model with intermediate-CTC-based regularization and data…
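To make the "predefined strength" point concrete, the following is a minimal sketch of static SpecAugment-style masking, in which the number and maximum width of the masks are fixed before training. It is not the paper's CBA method or its MinMax-IBF policy. The function name, parameter names, and default values are assumptions chosen for illustration, and plain Python lists are used to keep the sketch dependency-free.

```python
import random


def mask_spectrogram(spec, num_freq_masks=2, max_freq_width=8,
                     num_time_masks=2, max_time_width=20, seed=None):
    """Zero out random frequency bands and time spans of a spectrogram.

    `spec` is a list of rows, indexed as spec[freq_bin][time_frame].
    The mask counts and maximum widths are fixed, preset parameters.
    """
    rng = random.Random(seed)
    out = [row[:] for row in spec]  # copy so the input is left untouched
    n_freq = len(out)
    n_time = len(out[0]) if out else 0

    # Frequency masking: blank `width` consecutive frequency bins.
    for _ in range(num_freq_masks):
        width = min(rng.randint(0, max_freq_width), n_freq)
        start = rng.randint(0, n_freq - width)
        for f in range(start, start + width):
            out[f] = [0.0] * n_time

    # Time masking: blank `width` consecutive frames in every bin.
    for _ in range(num_time_masks):
        width = min(rng.randint(0, max_time_width), n_time)
        start = rng.randint(0, n_time - width)
        for row in out:
            row[start:start + width] = [0.0] * width

    return out


# Hypothetical input: 80 frequency bins by 300 frames, all ones.
spec = [[1.0] * 300 for _ in range(80)]
masked = mask_spectrogram(spec, seed=0)
```

Every utterance receives the same masking budget here regardless of how hard it is to recognize, which is the static behavior the abstract contrasts with complexity-based adjustment.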
Taxonomy
Topics: Fault Detection and Control Systems · Machine Learning and ELM · Distributed Sensor Networks and Detection Algorithms
