HydraFormer: One Encoder For All Subsampling Rates

Yaoxun Xu; Xingchen Song; Zhiyong Wu; Di Wu; Zhendong Peng; Binbin Zhang

arXiv:2408.04325 · eess.AS · August 9, 2024

TL;DR

HydraFormer introduces a unified model with multiple branches for different subsampling rates in speech recognition, reducing costs and maintaining high performance across diverse scenarios.

Contribution

The paper presents HydraFormer, a model that handles multiple subsampling rates within a single encoder, which adds deployment flexibility and avoids training and deploying a separate model for each rate.

Findings

01  Effective adaptation to various subsampling rates and languages.

02  Maintains high recognition accuracy across different settings.

03  Demonstrates stability and transferability from pretrained models.

Abstract

In automatic speech recognition, subsampling is essential for tackling diverse scenarios. However, the inadequacy of a single subsampling rate to address various real-world situations often necessitates training and deploying multiple models, consequently increasing associated costs. To address this issue, we propose HydraFormer, comprising HydraSub, a Conformer-based encoder, and a BiTransformer-based decoder. HydraSub encompasses multiple branches, each representing a distinct subsampling rate, allowing for the flexible selection of any branch during inference based on the specific use case. HydraFormer can efficiently manage different subsampling rates, significantly reducing training and deployment expenses. Experiments on AISHELL-1 and LibriSpeech datasets reveal that HydraFormer effectively adapts to various subsampling rates and languages while maintaining high recognition…
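
The branch-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name MultiRateFrontEnd, the rates 4, 6 and 8, and the use of plain average pooling in place of learned subsampling layers are all assumptions made for illustration, and the shared encoder is reduced to an identity placeholder. The sketch only shows how one model can hold several subsampling branches and pick one per call, so that the output sequence length depends on the chosen rate.

```python
from typing import Callable, Dict, List, Sequence

Frame = List[float]


def average_pool(frames: Sequence[Frame], rate: int) -> List[Frame]:
    """Stand-in subsampling (assumption): average every `rate` consecutive
    frames and drop any incomplete trailing window."""
    pooled: List[Frame] = []
    for start in range(0, len(frames) - rate + 1, rate):
        window = frames[start:start + rate]
        dim = len(window[0])
        pooled.append([sum(f[d] for f in window) / rate for d in range(dim)])
    return pooled


class MultiRateFrontEnd:
    """Hypothetical multi-branch front end: one branch per subsampling rate,
    all feeding the same shared encoder."""

    def __init__(self, rates: Sequence[int]) -> None:
        # One branch per rate; `r=rate` binds the current value in each lambda.
        self.branches: Dict[int, Callable[[Sequence[Frame]], List[Frame]]] = {
            rate: (lambda frames, r=rate: average_pool(frames, r))
            for rate in rates
        }

    def shared_encoder(self, frames: Sequence[Frame]) -> List[Frame]:
        # Placeholder for the shared encoder: identity in this sketch.
        return [list(f) for f in frames]

    def forward(self, frames: Sequence[Frame], rate: int) -> List[Frame]:
        # Select the branch at call time, then run the shared encoder.
        if rate not in self.branches:
            raise ValueError(f"no branch for subsampling rate {rate}")
        return self.shared_encoder(self.branches[rate](frames))


# 48 input frames with 2 features each (toy data).
frames = [[float(i), 2.0 * i] for i in range(48)]
model = MultiRateFrontEnd([4, 6, 8])
lengths = {rate: len(model.forward(frames, rate)) for rate in (4, 6, 8)}
print(lengths)  # {4: 12, 6: 8, 8: 6}
```

A higher rate yields a shorter sequence for the encoder to process, which is the trade-off a caller would weigh when choosing a branch for a given use case.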

Code & Models

Repositories

hydraformer/hydraformer (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Image Enhancement Techniques