FROST-EMA: Finnish and Russian Oral Speech Dataset of Electromagnetic Articulography Measurements with L1, L2 and Imitated L2 Accents
Satu Hopponen, Tomi Kinnunen, Alexandre Nikolaev, Rosa González Hautamäki, Lauri Tavi, Einar Meister

TL;DR
This paper introduces FROST-EMA, a bilingual speech dataset with electromagnetic articulography (EMA) data from Finnish and Russian speakers. It covers speech in the native language (L1), the second language (L2), and an imitated L2 accent, and supports research on language variability, accents, and speaker verification.
Contribution
The paper presents a new bilingual electromagnetic articulography dataset with speech in L1, L2, and imitated L2, along with two preliminary case studies demonstrating its research potential.
Findings
L2 and imitated L2 affect speaker verification performance
In a single-speaker illustration, articulatory patterns differ across L1, L2, and the fake accent
Dataset supports phonetic and technological research
Abstract
We introduce a new FROST-EMA (Finnish and Russian Oral Speech Dataset of Electromagnetic Articulography) corpus. It consists of 18 bilingual speakers, who produced speech in their native language (L1), second language (L2), and imitated L2 (fake foreign accent). The new corpus enables research into language variability from phonetic and technological points of view. Accordingly, we include two preliminary case studies to demonstrate both perspectives. The first case study explores the impact of L2 and imitated L2 on the performance of an automatic speaker verification system, while the second illustrates the articulatory patterns of one speaker in L1, L2, and a fake accent.
Taxonomy
Topics: Phonetics and Phonology Research · Emotion and Mood Recognition · Speech Recognition and Synthesis
