Matlab-based Epoch Extraction for Speaker Differentiation

Kunlun Li; Daniel Ferro; Xu Zhao; Abdul Jabbar Syed; Anil K Vuppala; Azeemuddin Syed

arXiv:2407.18447·eess.AS·July 29, 2024

Open Access

TL;DR

This paper presents a Matlab-based approach for extracting speech Epochs to differentiate speakers, evaluates methods such as ZFF and ZFR, and demonstrates effective speaker identification with a Raspberry Pi hardware implementation.

Contribution

It introduces a novel hardware implementation for Epoch extraction using Raspberry Pi and compares multiple methods for accuracy and stability in speaker differentiation.

Findings

01 Average Delta for same speaker: 7.5 epochs

02 Average Delta decreases significantly for different speakers

03 Raspberry Pi implementation is feasible for real-time speaker differentiation

Abstract

Epoch extraction has become increasingly popular in recent years for speech analysis research because accurately detecting the location of the Epoch is crucial for analyzing speech signals. The Epoch, occurring at the instant of excitation in the vocal tract system, particularly during glottal closure, plays a significant role in differentiating speakers in multi-speaker conversations. However, the extraction of the Epoch poses a challenge due to the time-varying factors in the vocal tract system, which makes deconvolution for obtaining the original excitation location more complex. In this paper, various methods for Epoch extraction, including Zero Frequency Filtering (ZFF) and Zero Frequency Resonator (ZFR), will be discussed, and their pros and cons evaluated. In addition, the stability, accuracy, and feasibility of each method will be compared. The evaluation will involve a…
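To make the Zero Frequency Filtering (ZFF) idea concrete, here is a minimal sketch of the steps as they are commonly described in the ZFF literature: difference the signal, pass it through two cascaded resonators at 0 Hz, remove the resulting polynomial trend with a moving average, and take positive-going zero crossings as Epoch candidates. This is not the paper's Matlab code. The function names, the pure-Python style, the `half_window` value, the three trend-removal passes, and the synthetic impulse train are all assumptions made for illustration.

```python
def zero_frequency_resonator(x):
    """Ideal resonator at 0 Hz: y[n] = x[n] + 2*y[n-1] - y[n-2]."""
    y, prev1, prev2 = [], 0.0, 0.0
    for value in x:
        current = value + 2.0 * prev1 - prev2
        y.append(current)
        prev2, prev1 = prev1, current
    return y


def remove_trend(signal, half_window):
    """Subtract a centred moving average of width 2*half_window + 1.

    Only samples with a full window are kept, so the output is
    2*half_window samples shorter than the input.
    """
    width = 2 * half_window + 1
    return [
        signal[i] - sum(signal[i - half_window:i + half_window + 1]) / width
        for i in range(half_window, len(signal) - half_window)
    ]


def zff_epochs(speech, half_window, passes=3):
    """Return sample indices of Epoch candidates (hypothetical helper).

    Assumes a non-empty signal whose excitation at glottal closure is
    negative-going; for inverted recordings, negative-going zero
    crossings would be used instead.
    """
    # 1) Difference the signal to suppress any DC offset.
    diff = [speech[0]] + [speech[i] - speech[i - 1] for i in range(1, len(speech))]
    # 2) Cascade two zero-frequency resonators (four poles at z = 1).
    y = zero_frequency_resonator(zero_frequency_resonator(diff))
    # 3) Remove the polynomial trend; repeated passes are an assumption
    #    made here so that the residual trend is negligible.
    offset = 0
    for _ in range(passes):
        y = remove_trend(y, half_window)
        offset += half_window
    # 4) Positive-going zero crossings mark the Epoch candidates.
    return [i + offset for i in range(1, len(y)) if y[i - 1] < 0.0 <= y[i]]


# Synthetic stand-in for voiced speech: one negative impulse every 80 samples
# (a 100 Hz pitch if the sampling rate were 8 kHz).
period = 80
speech = [-1.0 if n > 0 and n % period == 0 else 0.0 for n in range(2000)]
epochs = zff_epochs(speech, half_window=60)
intervals = [later - earlier for earlier, later in zip(epochs, epochs[1:])]
```

In this sketch the window spans roughly 1.5 pitch periods, which sits in the one-to-two-period range usually suggested for ZFF. On real speech the average pitch period would have to be estimated first, and samples within `passes * half_window` of either end of the signal are not analysed.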

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing