# Persian Vowel recognition with MFCC and ANN on PCVC speech dataset

**Authors:** Saber Malekzadeh, Mohammad Hossein Gholizadeh, Seyed Naser Razavi

arXiv: 1812.06953 · 2018-12-18

## TL;DR

This paper proposes a method for recognizing Persian consonant-vowel phoneme combinations using MFCC features and a multilayer perceptron ANN classifier on the PCVC dataset, and reports the average recognition rate for vowel phonemes.

## Contribution

The paper introduces PCVC (Persian Consonant-Vowel Combination), a new Persian speech dataset, and proposes a new recognition method for it that pairs MFCC feature extraction with a feed-forward multilayer perceptron ANN.

## Key findings

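- Reports the average recognition rate for vowel phonemes after training and testing.
- Uses MFCC features as input to a feed-forward multilayer perceptron ANN for phoneme recognition.
- Provides PCVC, a new Persian dataset with 20 sets of audio samples from 10 speakers, combining 23 consonant and 6 vowel phonemes.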

## Abstract

This paper proposes a new method for recognizing consonant-vowel phoneme combinations on a new Persian speech dataset, PCVC (Persian Consonant-Vowel Combination). The PCVC dataset contains 20 sets of audio samples from 10 speakers, covering combinations of the 23 consonant and 6 vowel phonemes of the Persian language. Each sample combines one consonant and one vowel: the consonant is pronounced first, followed immediately by the vowel. Each sound sample is a 2-second frame of audio that contains, on average, 0.5 seconds of speech; the rest is silence. The proposed method computes MFCCs (Mel-Frequency Cepstral Coefficients) on every partitioned sound sample. Each training sample's MFCC vector is then given to a feed-forward multilayer perceptron ANN (Artificial Neural Network) for training. Finally, the test samples are run through the trained ANN model for phoneme recognition. Results are presented for vowel recognition, and the average recognition percentage across vowel phonemes is computed.
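The pipeline described in the abstract (isolate the speech, compute MFCCs, train a multilayer perceptron) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not give the sample rate, frame sizes, number of coefficients, network size, or the way a clip is reduced to a single vector, so every such value below is an assumption (16 kHz, 25 ms frames, 13 coefficients, mean pooling over frames, one hidden layer of 16 tanh units). The energy-based `trim_silence` is a stand-in for whatever partitioning the paper uses, and the clips are synthetic tones, not PCVC audio.

```python
import numpy as np

def trim_silence(x, frame_len=400, ratio=0.1):
    """Keep the span between the first and last frame above ratio * peak energy (assumed heuristic)."""
    n = len(x) // frame_len
    energy = (x[: n * frame_len].reshape(n, frame_len) ** 2).sum(axis=1)
    active = np.flatnonzero(energy > ratio * energy.max())
    if active.size == 0:
        return x
    return x[active[0] * frame_len : (active[-1] + 1) * frame_len]

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(to_mel(0.0), to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(x, sr=16000, frame_len=400, hop=160, n_fft=512, n_filters=26, n_coeffs=13):
    """Frame, window, power spectrum, log mel energies, then DCT-II. Returns (n_frames, n_coeffs)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_mel @ basis.T

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_mlp(X, y, n_classes, hidden=16, lr=0.1, epochs=200, seed=0):
    """One-hidden-layer perceptron trained by full-batch gradient descent on cross-entropy."""
    rng = np.random.default_rng(seed)
    W1, b1 = rng.normal(0, 0.1, (X.shape[1], hidden)), np.zeros(hidden)
    W2, b2 = rng.normal(0, 0.1, (hidden, n_classes)), np.zeros(n_classes)
    Y = np.eye(n_classes)[y]
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        dZ2 = (softmax(H @ W2 + b2) - Y) / len(X)   # gradient at the output logits
        dH = (dZ2 @ W2.T) * (1 - H ** 2)            # backpropagate through tanh
        W2 -= lr * H.T @ dZ2
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * X.T @ dH
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def predict_proba(X, params):
    W1, b1, W2, b2 = params
    return softmax(np.tanh(X @ W1 + b1) @ W2 + b2)

def synth_clip(freq, rng, sr=16000):
    """Hypothetical stand-in for a PCVC clip: 2 s of faint noise with 0.5 s of tone in the middle."""
    x = 0.001 * rng.standard_normal(2 * sr)
    start = 3 * sr // 4
    x[start : start + sr // 2] += 0.5 * np.sin(2 * np.pi * freq * np.arange(sr // 2) / sr)
    return x

rng = np.random.default_rng(0)
freqs = [300, 500, 800, 1200, 1800, 2500]          # six made-up classes, one per "vowel"
clips = [(synth_clip(f, rng), k) for k, f in enumerate(freqs) for _ in range(5)]
speech = trim_silence(clips[0][0])                 # speech portion of the first clip
feats = mfcc(speech)                               # per-frame MFCCs for that clip
X = np.array([mfcc(trim_silence(x)).mean(axis=0) for x, _ in clips])  # mean pooling (assumed)
y = np.array([k for _, k in clips])
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)  # standardize features before training
params = train_mlp(X, y, n_classes=len(freqs))
probs = predict_proba(X, params)                   # class probabilities per clip
```

With real recordings, `synth_clip` would be replaced by loading each 2-second PCVC sample, the labels would be the six vowel classes, and the average recognition rate would be computed on a held-out test split instead of the training data used here.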

---
Source: https://tomesphere.com/paper/1812.06953