An Automatic Speech Recognition System for Bengali Language based on Wav2Vec2 and Transfer Learning

Tushar Talukder Showrav

arXiv:2209.08119 · eess.AS · September 21, 2022 · 1 citation

TL;DR

This paper presents a Bengali speech recognition system using Wav2Vec2 and transfer learning, achieving promising results with limited training data in a low-resource language context.

Contribution

It introduces a transfer learning-based end-to-end speech recognition approach tailored for Bengali, addressing data scarcity issues.

Findings

01 Achieved a Levenshtein Mean Distance score of 3.819 on test data

02 Effectively modeled Bengali speech with only 1000 training samples

03 Demonstrated potential for low-resource language ASR development
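To make the first finding concrete, here is a minimal sketch of how a mean Levenshtein distance between reference and predicted transcripts could be computed. It is an illustration under stated assumptions, not the paper's evaluation code: the function names `levenshtein` and `mean_levenshtein` are hypothetical, and the sketch compares Unicode code points, whereas this page does not say whether the reported 3.819 was measured over characters or words.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-code-point insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    # Keep the shorter string on the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def mean_levenshtein(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Average edit distance over paired reference/predicted transcripts.

    Assumption: an unweighted mean over utterances; the paper's exact
    averaging scheme is not given on this page.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        raise ValueError("at least one transcript pair is required")
    total = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total / len(references)
```

Because Python strings iterate over code points, a Bengali consonant and its dependent vowel sign count as separate symbols here; a grapheme-level comparison would give different numbers.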

Abstract

Automatic speech recognition (ASR) is an independent, automated method of decoding and transcribing spoken language. A typical ASR system extracts features from audio recordings or streams and runs one or more algorithms to map those features to the corresponding text. Considerable research has been done in the field of speech signal processing in recent years. Given adequate resources, both conventional ASR and emerging end-to-end (E2E) speech recognition have produced promising results. For low-resource languages like Bengali, however, the state of ASR lags behind, even though the language is spoken by over 500 million people all over the world. Despite its popularity, few diverse open-source datasets are available, which makes it difficult to conduct research on Bengali speech recognition systems. This paper…

Code & Models

Repositories

tushartalukder/bengali-asr-model-using-wav2vec2
none · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing

Methods: Test