Artificial Neural Networks to Recognize Speakers Division from Continuous Bengali Speech

Hasmot Ali; Md. Fahad Hossain; Md. Mehedi Hasan; Sheikh Abujar; Sheak Rashed Haider Noori

arXiv:2404.15168 · eess.AS · April 24, 2024 · 1 cites

TL;DR

This paper presents a neural network-based method for identifying the geographical division of Bengali speakers from continuous speech, achieving 85.44% accuracy on more than 45 hours of speech from 633 speakers.

Contribution

It introduces a speaker division recognition system for Bengali speech using MFCC features and neural networks, with a comprehensive dataset and preprocessing techniques.

Findings

01. Achieved 85.44% accuracy in speaker division classification.

02. Utilized MFCC and Delta features with neural networks for classification.

03. Processed over 45 hours of speech data from 633 speakers.
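As a rough illustration of the Delta features mentioned in the findings, the sketch below computes first-order deltas from a sequence of per-frame coefficients using the common regression formula, d_t = Σ n·(c_{t+n} − c_{t−n}) / (2·Σ n²). The paper summary does not state the window width, the number of MFCCs, or the tooling, so the function names, the `width=2` default, and the edge-padding choice are assumptions made only for this example, not the authors' implementation.

```python
def delta_features(frames, width=2):
    """Regression-based deltas for a list of per-frame coefficient lists.

    Assumption: frames beyond the edges are replaced by the nearest valid
    frame (edge padding); `width` is a hypothetical default.
    """
    if not frames:
        return []
    num_frames = len(frames)
    # Normalizer of the regression formula: 2 * sum(n^2) for n = 1..width.
    denom = 2 * sum(n * n for n in range(1, width + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def stack_with_deltas(frames, width=2):
    """Concatenate each frame's static coefficients with its deltas."""
    deltas = delta_features(frames, width)
    return [static + delta for static, delta in zip(frames, deltas)]


if __name__ == "__main__":
    # Toy one-coefficient "MFCC" track that rises by 1 per frame.
    toy = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    for row in stack_with_deltas(toy):
        print(row)
```

Each stacked row could then serve as one input vector to a classifier; in practice the static coefficients would come from an MFCC extractor rather than a toy list.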

Abstract

Voice-based applications dominate the era of automation because speech carries many factors that reveal information about the speaker as well as the spoken content. Modern Automatic Speech Recognition (ASR) is a blessing in the field of Human-Computer Interaction (HCI), enabling efficient communication between humans and devices using Artificial Intelligence technology. Speech is one of the easiest mediums of communication, and it carries many identifying features for different speakers. It is now possible to determine speakers and their identity from their speech through speaker recognition. In this paper, we present a method that provides a speaker's geographical identity within a certain region using continuous Bengali speech. We consider the eight divisions of Bangladesh as the geographical regions. We applied the Mel Frequency Cepstral Coefficient (MFCC) and Delta features on an…

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis