Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation
Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, Chen-An Li, Tsu-Yuan Hsu, Shinji Watanabe, Hung-yi Lee

TL;DR
This paper introduces a fully unsupervised cascaded speech translation system trained without any paired data. It uses denoising back-translation (DBT) to make the translation stage more robust, and on CoVoST 2 and CVSS it is comparable with some early supervised methods in some language pairs.
Contribution
The paper proposes a novel denoising back-translation method to enhance unsupervised neural machine translation within cascaded speech translation systems, reducing error propagation.
Findings
Denoising back-translation increases the BLEU score by 0.7 to 0.9 in all three translation directions.
The unsupervised system is comparable with some early supervised methods in some language pairs on the CoVoST 2 and CVSS datasets.
Simplified pipeline reduces inference latency and maintains translation quality.
Abstract
Most speech translation models rely heavily on parallel data, which is hard to collect, especially for low-resource languages. To tackle this issue, we propose to build a cascaded speech translation system without leveraging any kind of paired data. We use fully unpaired data to train our unsupervised systems and evaluate our results on CoVoST 2 and CVSS. The results show that our work is comparable with some other early supervised methods in some language pairs. Because cascaded systems suffer from severe error propagation, we propose denoising back-translation (DBT), a novel approach to building robust unsupervised neural machine translation (UNMT). DBT increases the BLEU score by 0.7 to 0.9 in all three translation directions. Moreover, we simplified the pipeline of our cascaded system to reduce inference latency and conducted a comprehensive analysis…
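The abstract does not spell out how DBT is implemented. As an illustration only, the sketch below shows one plausible reading: during back-translation, the synthetic source sentence is corrupted before being paired with the clean target, so the translation model learns to tolerate errors like those an upstream speech recognizer might produce. The noise model (token drops and neighbour swaps), the function names `add_noise` and `make_dbt_pairs`, and the probabilities are all assumptions made for this example, not the authors' method.

```python
import random
from typing import Callable, List, Tuple

Tokens = List[str]


def add_noise(tokens: Tokens, rng: random.Random,
              p_drop: float = 0.1, p_swap: float = 0.1) -> Tokens:
    """Corrupt a sentence by dropping tokens and swapping neighbours.

    Hypothetical noise model standing in for recognition errors.
    """
    # Drop each token independently with probability p_drop.
    kept = [t for t in tokens if rng.random() >= p_drop]
    if not kept:
        # Avoid turning a non-empty sentence into an empty one.
        kept = tokens[:1]
    out = kept[:]
    i = 0
    while i < len(out) - 1:
        if rng.random() < p_swap:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # do not re-swap the pair just swapped
        else:
            i += 1
    return out


def make_dbt_pairs(target_sentences: List[Tokens],
                   back_translate: Callable[[Tokens], Tokens],
                   rng: random.Random,
                   p_drop: float = 0.1,
                   p_swap: float = 0.1) -> List[Tuple[Tokens, Tokens]]:
    """Build (noisy synthetic source, clean target) training pairs."""
    pairs = []
    for tgt in target_sentences:
        pseudo_src = back_translate(tgt)  # target -> synthetic source
        noisy_src = add_noise(pseudo_src, rng, p_drop, p_swap)
        pairs.append((noisy_src, tgt))
    return pairs


if __name__ == "__main__":
    # Toy stand-in for a real target-to-source translation model.
    toy_back_translate = lambda toks: [t.upper() for t in toks]
    monolingual = [["the", "cat", "sat"], ["hello", "world"]]
    for src, tgt in make_dbt_pairs(monolingual, toy_back_translate,
                                   random.Random(0)):
        print(src, "->", tgt)
```

In a real system the pairs would feed a training step for the source-to-target model, and the noise would ideally mimic the error patterns of the actual recognizer rather than uniform drops and swaps.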
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling
