Cloud-based Automatic Speech Recognition Systems for Southeast Asian Languages
Lei Wang, Rong Tong, Cheung Chi Leung, Sunil Sivadas, Chongjia Ni, Bin Ma

TL;DR
This paper discusses the development of cloud-based automatic speech recognition systems for Southeast Asian languages, focusing on strategies for collecting the resources needed for Bahasa Indonesia and Thai, two languages with limited speech and text data.
Contribution
It introduces resource collection strategies for building ASR systems for under-resourced Southeast Asian languages using cloud-based approaches.
Findings
Effective resource collection methods demonstrated for Bahasa Indonesia and Thai
Addressed challenges of limited speech and text data in regional languages
Proposed strategies are intended to support ASR development for under-resourced languages
Abstract
This paper provides an overview of our Automatic Speech Recognition (ASR) systems for Southeast Asian languages. Because little prior work exists on these regional languages, several difficulties must be addressed before building the systems, including limited speech and text resources and a lack of linguistic knowledge. This work takes Bahasa Indonesia and Thai as examples to illustrate strategies for collecting the various resources required to build ASR systems.
