Developing a Multi-Platform Speech Recording System Toward Open Service of Building Large-Scale Speech Corpora

Keita Ishizuka; Takashi Nose

arXiv:1912.09148·cs.HC·December 20, 2019

Open Access

TL;DR

This paper presents the development of a multi-platform, browser-based speech recording system designed to facilitate large-scale speech corpus collection through open, low-cost, and shared services for researchers and developers.

Contribution

It introduces a unified platform enabling both corpus builders and participants to use a common system, addressing the lack of shared services in previous speech data collection efforts.

Findings

01 System supports multi-platform browser-based recording

02 Enables low-cost, large-scale speech data collection

03 Facilitates shared access for researchers and participants

Abstract

This paper briefly reports our ongoing effort to develop a multi-platform, browser-based speech recording system. We designed the system as an open service for building large-scale speech corpora at low cost, available to any researcher or developer working on speech processing. The recent growth of crowdsourcing services, e.g., Amazon Mechanical Turk, enables us to reduce the cost of recruiting speakers on the web, and there have been many attempts to develop automated speech collection platforms or applications designed for use with crowdsourcing. However, a major problem with these previous studies and developments is that most of the systems are not offered as a common service for speech recording and corpus building, so each corpus builder needs to develop the system in their own environment including…

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Speech and dialogue systems · Music and Audio Processing