Persian Vowel recognition with MFCC and ANN on PCVC speech dataset
Saber Malekzadeh, Mohammad Hossein Gholizadeh, Seyed Naser Razavi

TL;DR
This paper proposes a method for recognizing Persian consonant-vowel phoneme combinations using MFCC features and an ANN classifier on the PCVC dataset, and reports the resulting vowel recognition accuracy.
Contribution
It introduces a new Persian speech dataset (PCVC) and applies MFCC and ANN techniques for phoneme recognition, which is a novel approach for Persian speech analysis.
Findings
Achieved measurable vowel recognition accuracy.
Demonstrated effectiveness of MFCC features with ANN.
Provided a new dataset for Persian phoneme recognition.
Abstract
This paper proposes a new method for recognizing consonant-vowel phoneme combinations on a new Persian speech dataset, PCVC (Persian Consonant-Vowel Combination), which is intended for Persian phoneme recognition. The PCVC dataset contains 20 sets of audio samples from 10 speakers, covering combinations of the 23 consonant and 6 vowel phonemes of the Persian language. Each sample combines one consonant and one vowel: the consonant is pronounced first, immediately followed by the vowel. Each sound sample is a 2-second frame of audio that contains, on average, 0.5 seconds of speech; the rest is silence. The proposed method computes MFCCs (Mel-Frequency Cepstral Coefficients) on every partitioned sound sample. Then, the MFCC vector of every training sample is given to a multilayer perceptron…
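The abstract notes that each 2-second sample holds only about 0.5 seconds of speech, so some partitioning step has to isolate the spoken region before MFCCs are computed. The abstract does not say how the authors do this; the following is a minimal sketch of one common approach, short-time energy thresholding, under stated assumptions. The sampling rate, frame length, hop length, relative threshold, and the synthetic 440 Hz test clip are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math

SAMPLE_RATE = 8000   # assumed; the abstract does not state the sampling rate
FRAME_LEN = 200      # 25 ms analysis frame (assumed)
HOP_LEN = 80         # 10 ms hop between frames (assumed)


def frame_energies(signal, frame_len=FRAME_LEN, hop_len=HOP_LEN):
    """Return the mean squared amplitude of each analysis frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def speech_bounds(signal, rel_threshold=0.1):
    """Return (start, end) sample indices spanning every frame whose energy is
    at least rel_threshold times the peak frame energy, or None if silent."""
    energies = frame_energies(signal)
    peak = max(energies, default=0.0)
    if peak == 0.0:
        return None
    active = [i for i, e in enumerate(energies) if e >= rel_threshold * peak]
    start = active[0] * HOP_LEN
    end = min(len(signal), active[-1] * HOP_LEN + FRAME_LEN)
    return start, end


def synthetic_sample():
    """Build a 2 s clip: silence, 0.5 s of a 440 Hz tone, then silence.

    The tone is a stand-in for speech; it is not PCVC data.
    """
    n_total = 2 * SAMPLE_RATE
    tone_start = int(0.75 * SAMPLE_RATE)
    tone_end = int(1.25 * SAMPLE_RATE)
    return [
        math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        if tone_start <= n < tone_end else 0.0
        for n in range(n_total)
    ]


clip = synthetic_sample()
start, end = speech_bounds(clip)
duration = (end - start) / SAMPLE_RATE  # length of the detected region, in seconds
```

On real recordings, background noise would call for a noise-floor estimate rather than a fixed fraction of the peak; the trimmed region `clip[start:end]` is what would then be passed on to feature extraction.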
