Confidence-based Ensembles of End-to-End Speech Recognition Models
Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg

TL;DR
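This paper introduces confidence-based ensembles of end-to-end speech recognition models: several expert models are combined, and only the output of the most confident one is used. This improves performance across domains and languages while requiring only a small validation set of each model's target data.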
Contribution
It proposes a confidence-based ensemble that uses only the output of the most confident model. The approach outperforms model selection via a dedicated language identification block and makes it possible to combine base and adapted models.
Findings
Confidence-based ensembles outperform language ID-based selection.
Combining base and adapted models improves accuracy on multiple datasets.
The approach remains effective with limited target data and across various architectures.
Abstract
The number of end-to-end speech recognition models grows every year. These models are often adapted to new domains or languages resulting in a proliferation of expert systems that achieve great results on target data, while generally showing inferior performance outside of their domain of expertise. We explore combination of such experts via confidence-based ensembles: ensembles of models where only the output of the most-confident model is used. We assume that models' target data is not available except for a small validation set. We demonstrate effectiveness of our approach with two applications. First, we show that a confidence-based ensemble of 5 monolingual models outperforms a system where model selection is performed via a dedicated language identification block. Second, we demonstrate that it is possible to combine base and adapted models to achieve strong results on both…
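The selection rule described in the abstract (use only the output of the most confident model) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the confidence measure here, the mean of the per-frame maximum probability, is an assumption chosen for simplicity, and the names `mean_max_prob`, `select_most_confident`, and the toy model outputs are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical model output: (model name, transcript, per-frame probability rows).
ModelOutput = Tuple[str, str, Sequence[Sequence[float]]]


def mean_max_prob(frame_probs: Sequence[Sequence[float]]) -> float:
    """Assumed confidence score: average of the highest probability in each frame."""
    if not frame_probs:
        raise ValueError("frame_probs must contain at least one frame")
    return sum(max(row) for row in frame_probs) / len(frame_probs)


def select_most_confident(outputs: List[ModelOutput]) -> Tuple[str, str, float]:
    """Score every model's output and keep only the most confident one."""
    if not outputs:
        raise ValueError("outputs must contain at least one model output")
    scored = [(name, text, mean_max_prob(probs)) for name, text, probs in outputs]
    return max(scored, key=lambda item: item[2])


# Toy example: two monolingual models decode the same utterance.
toy_outputs: List[ModelOutput] = [
    ("english_model", "hello world", [[0.9, 0.1], [0.8, 0.2]]),
    ("spanish_model", "jelou guorld", [[0.6, 0.4], [0.5, 0.5]]),
]
chosen_name, chosen_text, chosen_conf = select_most_confident(toy_outputs)
```

In this toy case the English model is selected because its frames are more peaked. Raw scores from different models are not necessarily comparable, which is one plausible use of the small validation set the abstract mentions; the sketch above leaves out any such calibration step.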
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques
Methods: Balanced Selection
