Automated speech tools for helping communities process restricted-access corpora for language revival efforts
Nay San, Martijn Bartelds, Tolúlopé Ògúnrèmí, Alison Mount, Ruben Thompson, Michael Higgins, Roy Barker, Jane Simpson, Dan Jurafsky

TL;DR
This paper presents a privacy-preserving workflow combining VAD, SLI, and ASR to facilitate access and annotation of restricted archival recordings of endangered languages, reducing transcription time with minimal training data.
Contribution
It introduces a novel integrated speech processing workflow that enables efficient triaging and annotation of restricted-access multilingual speech recordings for language revival.
Findings
Workflow reduces transcription time by 20%.
Effective with as few as 10 utterances per language for training.
Works with minimal annotated data, as little as 39 seconds.
Abstract
Many archival recordings of speech from endangered languages remain unannotated and inaccessible to community members and language learning programs. One bottleneck is the time-intensive nature of annotation. An even narrower bottleneck occurs for recordings with access constraints, such as language that must be vetted or filtered by authorised community members before annotation can begin. We propose a privacy-preserving workflow to widen both bottlenecks for recordings where speech in the endangered language is intermixed with a more widely-used language such as English for meta-linguistic commentary and questions (e.g. What is the word for 'tree'?). We integrate voice activity detection (VAD), spoken language identification (SLI), and automatic speech recognition (ASR) to transcribe the metalinguistic content, which an authorised person can quickly scan to triage recordings that can…
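The integration of VAD, SLI, and ASR described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `Region`, `TriageLine`, `triage`, `toy_sli`, and `toy_asr` are hypothetical names, the audio is represented by plain strings, and the VAD step is assumed to have already produced the list of speech regions. The point it shows is the privacy-preserving control flow: only regions identified as the metalinguistic language (assumed here to be labelled `"eng"`) are passed to ASR, and all other regions are labelled but left untranscribed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Region:
    """One speech region, as a VAD step might return it (hypothetical type)."""
    start: float  # seconds from the start of the recording
    end: float
    audio: str    # stand-in for audio samples; a real system would hold a waveform


@dataclass
class TriageLine:
    start: float
    end: float
    language: str
    text: Optional[str]  # None when the region was deliberately not transcribed


def triage(regions: List[Region],
           identify_language: Callable[[str], str],
           transcribe: Callable[[str], str],
           meta_language: str = "eng") -> List[TriageLine]:
    """Transcribe only regions identified as the metalinguistic language."""
    lines = []
    for region in regions:
        language = identify_language(region.audio)
        # Regions in any other language are labelled but never sent to ASR.
        text = transcribe(region.audio) if language == meta_language else None
        lines.append(TriageLine(region.start, region.end, language, text))
    return lines


# Toy stand-ins for trained SLI and ASR models (assumptions for illustration).
def toy_sli(audio: str) -> str:
    return "eng" if audio.startswith("eng:") else "other"


def toy_asr(audio: str) -> str:
    return audio.split(":", 1)[1]


regions = [
    Region(0.0, 2.1, "eng:what is the word for tree"),
    Region(2.4, 3.0, "other:<restricted speech>"),
]
lines = triage(regions, toy_sli, toy_asr)
for line in lines:
    shown = line.text or "(not transcribed)"
    print(f"{line.start:.1f}-{line.end:.1f} [{line.language}] {shown}")
```

In a real deployment the two callables would wrap trained SLI and ASR models, and the resulting lines would form the transcript that an authorised person scans when triaging a recording.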
Taxonomy
Topics: Interpreting and Communication in Healthcare · Speech Recognition and Synthesis · Natural Language Processing Techniques
