End-to-End Speech Translation for Code Switched Speech

Orion Weller; Matthias Sperber; Telmo Pires; Hendra Setiawan; Christian Gollan; Dominic Telaar; Matthias Paulik

arXiv:2204.05076 · cs.CL · April 12, 2022 · 1 citation

TL;DR

This paper investigates end-to-end speech translation for code-switched English/Spanish speech, introducing a new corpus and comparing various architectures, with bidirectional end-to-end models showing strong performance even without CS training data.

Contribution

It presents a novel speech translation corpus for code-switched speech and evaluates multiple architectures, highlighting the effectiveness of bidirectional end-to-end models.

Findings

01. Bidirectional end-to-end models perform well on code-switched (CS) speech.

02. The ST architectures perform well on CS speech even when no CS training data is used.

03. A new ST corpus for CS speech, derived from existing public data sets, is introduced.

Abstract

Code switching (CS) refers to the phenomenon of interchangeably using words and phrases from different languages. CS can pose significant accuracy challenges to NLP, due to the often monolingual nature of the underlying systems. In this work, we focus on CS in the context of English/Spanish conversations for the task of speech translation (ST), generating and evaluating both transcript and translation. To evaluate model performance on this task, we create a novel ST corpus derived from existing public data sets. We explore various ST architectures across two dimensions: cascaded (transcribe then translate) vs end-to-end (jointly transcribe and translate) and unidirectional (source -> target) vs bidirectional (source <-> target). We show that our ST architectures, and especially our bidirectional end-to-end architecture, perform well on CS speech, even when no CS training data is used.
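The two design dimensions named in the abstract can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in and not the authors' implementation: "audio" is modeled as a tuple of spoken tokens, the recognizer and translator are toy word-lookup functions, and the names `toy_asr`, `make_translator`, `make_joint_model`, `cascaded_st`, and `end_to_end_st` are invented for illustration. The sketch only shows how the pieces are wired together, not how any model is trained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

# Assumption: audio is represented as a tuple of spoken tokens so the sketch
# runs without any speech model.
Audio = Tuple[str, ...]


@dataclass(frozen=True)
class STOutput:
    transcript: str
    translation: str


# Toy lexicons (illustrative only, not from the paper).
ES_TO_EN: Dict[str, str] = {"hola": "hello", "amigo": "friend"}
EN_TO_ES: Dict[str, str] = {en: es for es, en in ES_TO_EN.items()}
TABLES = {"es-en": ES_TO_EN, "en-es": EN_TO_ES}


def toy_asr(audio: Audio) -> str:
    """Stand-in recognizer: joins the spoken tokens into a transcript."""
    return " ".join(audio)


def make_translator(directions: Set[str]) -> Callable[[str, str], str]:
    """One supported direction gives a unidirectional translator, two a bidirectional one."""
    def translate(text: str, direction: str) -> str:
        if direction not in directions:
            raise ValueError(f"unsupported direction: {direction}")
        table = TABLES[direction]
        # Words missing from the lexicon pass through unchanged.
        return " ".join(table.get(word, word) for word in text.split())
    return translate


def cascaded_st(audio: Audio, asr, mt, direction: str) -> STOutput:
    """Cascaded: transcribe first, then translate the transcript."""
    transcript = asr(audio)
    return STOutput(transcript, mt(transcript, direction))


def make_joint_model(directions: Set[str]):
    """Stand-in for a single model that returns transcript and translation together."""
    translate = make_translator(directions)

    def model(audio: Audio, direction: str) -> Tuple[str, str]:
        transcript = " ".join(audio)
        return transcript, translate(transcript, direction)
    return model


def end_to_end_st(audio: Audio, model, direction: str) -> STOutput:
    """End-to-end: one model call yields both outputs jointly."""
    transcript, translation = model(audio, direction)
    return STOutput(transcript, translation)


audio = ("hola", "my", "amigo")  # toy code-switched utterance
unidirectional = make_translator({"es-en"})
bidirectional = make_translator({"es-en", "en-es"})

print(cascaded_st(audio, toy_asr, unidirectional, "es-en"))
print(end_to_end_st(audio, make_joint_model({"es-en", "en-es"}), "es-en"))
```

In this toy setup the cascaded and end-to-end paths differ only in whether the transcript is produced by a separate component before translation; in a real system the end-to-end model would be a single network, which the stand-in does not attempt to reproduce.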

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Text Readability and Simplification