BiPhone: Modeling Inter Language Phonetic Influences in Text

Abhirut Gupta, Ananya B. Sai, Richard Sproat, Yuri Vasilevski, James S. Ren, Ambarish Jash, Sukhdeep S. Sodhi, and Aravindan Raghuveer

arXiv:2307.03322·cs.CL·July 10, 2023

Open Access

TL;DR

This paper introduces BiPhone, a model that simulates phonetic errors in L2 text influenced by L1, and demonstrates its impact on language understanding benchmarks, highlighting the need for phonetically robust models.

Contribution

The paper presents a novel method to generate L1-influenced phonetic errors in L2 text and introduces the FunGLUE benchmark for evaluating models on phonetically corrupted data.

Findings

01. BiPhone generates plausible phonetic corruptions with L1-specific variations.

02. Phonetic corruptions significantly degrade state-of-the-art language understanding models.

03. A new phoneme prediction pre-training task improves model robustness to phonetic noise.

Abstract

A large number of people are forced to use the Web in a language they have low literacy in due to technology asymmetries. Written text in the second language (L2) from such users often contains a large number of errors that are influenced by their native language (L1). We propose a method to mine phoneme confusions (sounds in L2 that an L1 speaker is likely to conflate) for pairs of L1 and L2. These confusions are then plugged into a generative model (Bi-Phone) for synthetically producing corrupted L2 text. Through human evaluations, we show that Bi-Phone generates plausible corruptions that differ across L1s and also have widespread coverage on the Web. We also corrupt the popular language understanding benchmark SuperGLUE with our technique (FunGLUE for Phonetically Noised GLUE) and show that SoTA language understanding models perform poorly. We also introduce a new phoneme prediction…
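
The pipeline the abstract describes (map L2 text to phonemes, swap phonemes that an L1 speaker is likely to conflate, map back to text) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the tables `G2P`, `P2G`, and `CONFUSIONS`, their probabilities, and the function names `sample_phoneme` and `corrupt` are hypothetical, and a one-character lookup stands in for real grapheme-to-phoneme and phoneme-to-grapheme models.

```python
import random

# Hypothetical toy tables, NOT taken from the paper.
# One L2 character -> one phoneme (a real system would use a learned model).
G2P = {"v": "v", "w": "w", "e": "ɛ", "s": "s", "t": "t", "r": "r", "y": "i"}
# Inverse lookup: phoneme -> L2 character.
P2G = {phoneme: char for char, phoneme in G2P.items()}
# P(produced phoneme | intended phoneme) for one imagined L1.
# Probability mass not listed here means "keep the intended phoneme".
CONFUSIONS = {"v": {"w": 0.4}, "w": {"v": 0.3}}


def sample_phoneme(phoneme, rng, confusions):
    """Sample a replacement for `phoneme`, or return it unchanged."""
    u = rng.random()
    cumulative = 0.0
    for alternative, prob in confusions.get(phoneme, {}).items():
        cumulative += prob
        if u < cumulative:
            return alternative
    return phoneme


def corrupt(word, rng, confusions=CONFUSIONS):
    """Return a phonetically corrupted spelling of `word`."""
    out = []
    for char in word:
        phoneme = G2P.get(char)
        if phoneme is None:
            # Characters outside the toy table pass through untouched.
            out.append(char)
            continue
        out.append(P2G[sample_phoneme(phoneme, rng, confusions)])
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(0)
    print([corrupt("very", rng) for _ in range(5)])
```

In this sketch, setting the /v/ to /w/ confusion probability to 1.0 turns "very" into "wery", while an empty confusion table leaves the word unchanged. Swapping in a different confusion table is how the toy model would express a different L1.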

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Natural Language Processing Techniques