Cross-lingual topic prediction for speech using translations

Sameer Bansal; Herman Kamper; Adam Lopez; Sharon Goldwater

arXiv:1908.11425 · cs.CL · March 31, 2020

TL;DR

This paper presents a cross-lingual topic classifier for low-resource speech, trained on 20 hours of translated speech using a direct speech-to-text translation model. It identifies the topic of 1-minute speech segments correctly over 70% of the time, which the authors suggest could support rapid crisis response.

Contribution

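The approach trains a direct speech-to-text translation model on a small amount of low-resource speech paired with text translations in a high-resource language, then classifies topics from the model's output. The translations are poor, yet still good enough for effective topic classification.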

Findings

01

Classifies the topic of 1-minute speech segments correctly over 70% of the time.

02

Improves on a majority-class baseline by 20%.

03

Uses only 20 hours of translated speech for training.

Abstract

Given a large amount of unannotated speech in a low-resource language, can we classify the speech utterances by topic? We consider this question in the setting where a small amount of speech in the low-resource language is paired with text translations in a high-resource language. We develop an effective cross-lingual topic classifier by training on just 20 hours of translated speech, using a recent model for direct speech-to-text translation. While the translations are poor, they are still good enough to correctly classify the topic of 1-minute speech segments over 70% of the time - a 20% improvement over a majority-class baseline. Such a system could be useful for humanitarian applications like crisis response, where incoming speech in a foreign low-resource language must be quickly assessed for further action.
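
The comparison against a majority-class baseline can be illustrated with a minimal sketch. The abstract does not say which classifier is applied to the translated text, so the bag-of-words Naive Bayes model below is an assumed stand-in, not the authors' method. The toy sentences are hypothetical placeholders for noisy speech-to-text translation output, and every name in the code is illustrative.

```python
import math
from collections import Counter, defaultdict


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy obtained by always predicting the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(y == majority for y in test_labels) / len(test_labels)


class NaiveBayesTopicClassifier:
    """Bag-of-words multinomial Naive Bayes with add-one smoothing.

    Assumption: a simple stand-in for whatever classifier consumes the
    translated text; the abstract does not specify one.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted topic matches the reference label."""
    return sum(model.predict(t) == y for t, y in zip(texts, labels)) / len(labels)


# Hypothetical toy data standing in for poor-quality translation output.
train_texts = [
    "go to the doctor for pain medicine",
    "the doctor say hospital tomorrow",
    "i pay the bill with the bank money",
    "doctor give medicine for the pain",
]
train_labels = ["health", "health", "finance", "health"]
test_texts = [
    "she need medicine from the hospital",
    "the bank want more money",
    "the pain is bad call doctor",
]
test_labels = ["health", "finance", "health"]

model = NaiveBayesTopicClassifier().fit(train_texts, train_labels)
baseline_acc = majority_baseline_accuracy(train_labels, test_labels)
model_acc = accuracy(model, test_texts, test_labels)
print(f"majority baseline: {baseline_acc:.2f}, classifier: {model_acc:.2f}")
```

The baseline ignores the input entirely, so any gain over it reflects topical signal that survives in the translated words, even when the translations are poor as full sentences.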
