Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach

Siqi Li; Danni Liu; Jan Niehues

arXiv:2409.09009 · cs.CL · October 2, 2024

TL;DR

This paper introduces a retrieval-and-demonstration method that improves rare word translation accuracy in direct speech translation models by prepending retrieved examples to the input. The gain is largest with gold examples, where rare word accuracy improves by 17.6%.

Contribution

It proposes a novel retrieval-and-demonstration approach for direct speech translation that effectively incorporates retrieved examples to improve rare word translation accuracy.

Findings

01

Improved rare word translation accuracy by 17.6% with gold examples

02

Speech-to-speech retrieval outperforms the other retrieval modalities (speech-to-text, text-to-text) and is more robust

03

Effective adaptation of standard models to leverage examples for rare words

Abstract

Direct speech translation (ST) models often struggle with rare words. Incorrect translation of these words can have severe consequences, impacting translation quality and user trust. While rare word translation is inherently challenging for neural models due to sparse learning signals, real-world scenarios often allow access to translations of past recordings on similar topics. To leverage these valuable resources, we propose a retrieval-and-demonstration approach to enhance rare word translation accuracy in direct ST models. First, we adapt existing ST models to incorporate retrieved examples for rare word translation, which allows the model to benefit from prepended examples, similar to in-context learning. We then develop a cross-modal (speech-to-speech, speech-to-text, text-to-text) retriever to locate suitable examples. We demonstrate that standard ST models can be effectively…
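The two steps the abstract describes, retrieving a suitable example and prepending it as a demonstration, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `Example` class, the toy two-dimensional embeddings, cosine similarity as the retrieval score, and the `build_demonstration_input` layout are all assumptions made for the sketch, and strings stand in for audio.

```python
import math
from dataclasses import dataclass


@dataclass
class Example:
    """A past utterance with its translation (hypothetical structure)."""
    embedding: list[float]  # assumed fixed-size encoding of the utterance
    source: str             # stand-in for the utterance (audio or transcript)
    translation: str        # its known translation


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding: list[float], pool: list[Example]) -> Example:
    """Return the pool example most similar to the query (assumed scoring)."""
    return max(pool, key=lambda ex: cosine(query_embedding, ex.embedding))


def build_demonstration_input(example: Example, query_source: str) -> dict:
    """Prepend the retrieved example to the query (assumed input layout)."""
    return {
        "source": [example.source, query_source],
        "target_prefix": example.translation,
    }


pool = [
    Example([1.0, 0.0], "utterance about protein folding", "Vortrag zur Proteinfaltung"),
    Example([0.0, 1.0], "utterance about tax law", "Vortrag zum Steuerrecht"),
]
best = retrieve([0.9, 0.1], pool)
model_input = build_demonstration_input(best, "new utterance mentioning protein folding")
```

In this toy setup the query embedding lies closest to the first pool entry, so that entry and its translation are prepended. In a real system the embeddings would come from a trained speech or text encoder, depending on which retrieval modality is used.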

Code & Models

Repositories

siqilii/retrieve-and-demonstration-st
PyTorch · Official

Videos

Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling