A Nonparametric Bayesian Approach for Spoken Term detection by Example Query

Amir Hossein Harati Nejad Torbati; Joseph Picone

arXiv:1606.05967·cs.CL·June 21, 2016

TL;DR

This paper introduces a nonparametric Bayesian method for discovering acoustic units in low-resource languages and applies it to spoken term detection, reaching a P@N of 61.2% and an EER of 13.95% on the TIMIT dataset.

Contribution

The paper presents a novel nonparametric Bayesian approach for acoustic unit discovery and a spoken term detection algorithm based on these units, suitable for low-resource languages.

Findings

01

Achieved P@N of 61.2% on TIMIT

02

EER of 13.95% on TIMIT

03

Improved EER by 5% over previous methods

Abstract

State-of-the-art speech recognition systems use data-intensive context-dependent phonemes as acoustic units. However, these approaches do not translate well to low-resource languages, where large amounts of training data are not available. For such languages, automatic discovery of acoustic units is critical. In this paper, we demonstrate the application of nonparametric Bayesian models to acoustic unit discovery. We show that the discovered units are correlated with phonemes and are therefore linguistically meaningful. We also present a spoken term detection (STD) by example query algorithm based on these automatically learned units. We show that our proposed system produces a P@N of 61.2% and an EER of 13.95% on the TIMIT dataset. The improvement in the EER is 5%, while P@N is only slightly lower than that of the best reported system in the literature.
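The abstract reports P@N and EER without defining them. The sketch below assumes the definitions commonly used in query-by-example evaluation: P@N is the precision over the top N ranked hits, where N is the number of true occurrences of the query, and EER is the operating point where the false-acceptance rate equals the false-rejection rate. The function names and the toy scores are hypothetical illustrations, not the authors' evaluation code, and the paper's exact scoring protocol may differ.

```python
from typing import Sequence


def precision_at_n(scores: Sequence[float], labels: Sequence[int]) -> float:
    """P@N: fraction of true hits among the N top-scoring candidates,
    with N set to the total number of true hits (assumed definition)."""
    n = sum(labels)
    if n == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidates from highest to lowest detection score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    return sum(label for _, label in ranked[:n]) / n


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """EER: error rate at the threshold where false acceptance and
    false rejection rates are closest (assumed definition)."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both positive and negative labels are required")
    best_gap, eer = float("inf"), 0.0
    # Sweep every observed score as a threshold; accept when score >= threshold.
    for t in sorted(set(scores)):
        false_accepts = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        misses = sum(1 for s, l in zip(scores, labels) if s < t and l == 1)
        far, frr = false_accepts / neg, misses / pos
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy data (hypothetical): six candidate regions, three of them true hits.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]
print(precision_at_n(toy_scores, toy_labels))
print(equal_error_rate(toy_scores, toy_labels))
```

On this toy data two of the top three candidates are true hits, so P@N is 2/3, and the two error rates meet at a threshold of 0.7, giving an EER of 1/3. A higher P@N and a lower EER both indicate better detection.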
