Who Are We Talking About? Handling Person Names in Speech Translation

Marco Gaido; Matteo Negri; Marco Turchi

arXiv:2205.06755 · cs.CL · May 16, 2022

TL;DR

This paper analyses why speech translation systems handle person names poorly and proposes multilingual models and joint transcription-translation training, which yield a 47.8% relative improvement in person name accuracy.

Contribution

The paper provides a detailed analysis of person name errors in ASR and ST outputs and presents multilingual and joint training methods that improve person name translation in speech translation systems.

Findings

01. 47.8% relative improvement in person name accuracy

02. Nationality of the referred person identified as a key factor in errors, besides frequency in the training data

03. Effective multilingual and joint training strategies
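A relative improvement expresses the gain as a fraction of the baseline score, rather than as an absolute difference. A minimal sketch of the arithmetic follows; the function name and the accuracy values are hypothetical, chosen only for illustration, and are not results from the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical accuracy values for illustration only, not taken from the paper.
baseline_accuracy = 0.40
improved_accuracy = 0.50
gain = relative_improvement(baseline_accuracy, improved_accuracy)
print(f"Relative improvement: {gain:.1%}")
```

In this made-up case the absolute gain is 10 points, while the relative gain is a quarter of the baseline, which is why the two figures should not be confused when reading reported results.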

Abstract

Recent work has shown that systems for speech translation (ST), similarly to automatic speech recognition (ASR), poorly handle person names. This shortcoming not only leads to errors that can seriously distort the meaning of the input, but also hinders the adoption of such systems in application scenarios (like computer-assisted interpreting) where the translation of named entities, like person names, is crucial. In this paper, we first analyse the outputs of ASR/ST systems to identify the reasons for failures in person name transcription/translation. Besides the frequency in the training data, we pinpoint the nationality of the referred person as a key factor. We then mitigate the problem by creating multilingual models, and further improve our ST systems by forcing them to jointly generate transcripts and translations, prioritising the former over the latter. Overall, our…
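One common way to make a jointly trained model prioritise one output over another is to weight the two training losses unequally. The sketch below assumes this weighted-sum formulation purely for illustration: the abstract text above does not specify the mechanism, and the function name, the default weight, and the loss values are all hypothetical.

```python
def joint_loss(transcript_loss: float,
               translation_loss: float,
               transcript_weight: float = 0.75) -> float:
    """Combine two losses, giving `transcript_weight` to the transcript term.

    The weighted-sum form and the default weight are assumptions made for
    this example, not values reported by the paper.
    """
    if not 0.0 <= transcript_weight <= 1.0:
        raise ValueError("transcript_weight must lie in [0, 1]")
    translation_weight = 1.0 - transcript_weight
    return transcript_weight * transcript_loss + translation_weight * translation_loss


# Hypothetical per-batch loss values for illustration only.
combined = joint_loss(transcript_loss=2.0, translation_loss=4.0)
print(combined)
```

With a transcript weight above 0.5, errors in the transcript contribute more to the combined objective than errors in the translation, which is one plausible reading of "prioritising the former over the latter".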

Code & Models

Repositories

hlt-mt/fbk-fairseq
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems