Precise Detection of Speech Endpoints Dynamically: A Wavelet Convolution based approach

Tanmoy Roy; Tshilidzi Marwala; Snehashish Chakraverty

arXiv:1804.06159·eess.AS·September 26, 2018·Commun. Nonlinear Sci. Numer. Simul.

TL;DR

This paper introduces WCSEPD, a wavelet convolution-based algorithm that accurately detects speech endpoints in noisy conditions without requiring labeled training data, improving over traditional energy and zero-crossing methods.

Contribution

The paper presents a novel wavelet convolution approach for speech endpoint detection that effectively handles non-speech artifacts without the need for labeled training data.

Findings

1. Accurately detects speech endpoints amidst non-speech artifacts
2. Does not require labeled training data
3. Outperforms traditional energy-based methods

Abstract

Precise detection of speech endpoints is an important factor affecting the performance of systems that must extract speech utterances from a speech signal, such as Automatic Speech Recognition (ASR) systems. Existing endpoint detection (EPD) methods mostly use Short-Term Energy (STE) and Zero-Crossing Rate (ZCR) based approaches and their variants. However, STE and ZCR based EPD algorithms often fail in the presence of Non-speech Sound Artifacts (NSAs) produced by the speakers. Algorithms based on pattern recognition and classification techniques have also been proposed, but they require labeled data for training. This article proposes a new algorithm, Wavelet Convolution based Speech Endpoint Detection (WCSEPD), to extract speech endpoints. WCSEPD decomposes the speech signal into high-frequency and low-frequency components using wavelet convolution and computes entropy…
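
For readers unfamiliar with the baseline features the abstract mentions, the following is a minimal sketch of frame-level Short-Term Energy and Zero-Crossing Rate, followed by a naive energy-threshold endpoint pick. It is not the WCSEPD algorithm and is not taken from the paper. The function names, the 25 ms / 10 ms framing, the synthetic test signal, and the 10% threshold are all assumptions chosen for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames; a trailing partial frame is dropped."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_term_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

# Synthetic stand-in for an utterance (assumption): a 200 Hz tone between
# 0.25 s and 0.75 s, surrounded by low-level noise, sampled at 8 kHz.
fs = 8000
rng = np.random.default_rng(0)
x = 0.001 * rng.standard_normal(fs)
t = np.arange(2000, 6000) / fs
x[2000:6000] += 0.5 * np.sin(2 * np.pi * 200 * t)

frames = frame_signal(x, frame_len=200, hop=80)  # 25 ms frames, 10 ms hop
ste = short_term_energy(frames)
zcr = zero_crossing_rate(frames)

# Naive energy-only endpoint pick (illustrative threshold, not from the paper).
active = ste > 0.1 * ste.max()
first_frame = int(np.argmax(active))
last_frame = int(len(active) - 1 - np.argmax(active[::-1]))
print(first_frame, last_frame)
```

A fixed energy threshold like this marks any sufficiently loud frame as speech, whatever its source. That is consistent with the abstract's point that STE and ZCR based methods often fail when speakers produce non-speech sound artifacts.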
