Unsupervised Spoken Term Discovery on Untranscribed Speech

Man-Ling Sung

arXiv:2011.14060 · eess.AS · December 1, 2020

Open Access

TL;DR

This paper explores unsupervised spoken term discovery in untranscribed speech, using neural networks and pattern analysis to identify keywords and topics without prior linguistic knowledge.

Contribution

It introduces a novel unsupervised system combining acoustic segment modeling and pattern discovery, utilizing neural network features for zero-resource language processing.

Findings

01

Bottleneck features improve acoustic segment modeling.

02

System successfully discovers keywords aligning with transcriptions.

03

Keywords enable effective topic comparison using IR techniques.
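As a rough illustration of topic comparison over discovered keywords, the sketch below scores recordings against each other with TF-IDF weights and cosine similarity. The summary only says "IR techniques", so the choice of TF-IDF with cosine similarity is an assumption, and the keyword IDs (`kw1`, `kw2`, ...) and recordings are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per list of keyword IDs."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency: count each keyword once per doc
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF so that keywords present in every recording keep a positive weight.
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical discovered keywords, one list per recording.
recordings = [
    ["kw1", "kw2", "kw2"],
    ["kw2", "kw1"],
    ["kw5", "kw6"],
]
vecs = tfidf_vectors(recordings)
sim_related = cosine(vecs[0], vecs[1])    # recordings sharing keywords
sim_unrelated = cosine(vecs[0], vecs[2])  # recordings with no keywords in common
```

Recordings that share discovered keywords score higher than those that share none, which is the property a topic comparison of this kind relies on.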

Abstract

(Part of the abstract) In this thesis, we investigate the use of unsupervised spoken term discovery in tackling this problem. Unsupervised spoken term discovery aims to discover topic-related terminology in speech without knowledge of the phonetic properties of the language or of the content. It can be divided into two parts: acoustic segment modelling (ASM) and unsupervised pattern discovery. ASM learns the phonetic structure of zero-resource language audio with no phonetic knowledge available, generating self-derived "phonemes". The audio is labelled with these "phonemes" to obtain "phoneme" sequences. Unsupervised pattern discovery searches for repetitive patterns in the "phoneme" sequences. The discovered patterns can be grouped to determine the keywords of the audio. A multilingual neural network with a bottleneck layer is used for feature extraction. Experiments show that…
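The pattern discovery step described in the abstract can be pictured with a small sketch: given sequences of self-derived unit labels, count fixed-length subsequences and keep those that recur. This is a simplification under stated assumptions, not the thesis's method: exact n-gram matching stands in for whatever matching the thesis uses, and the unit labels (`u1`, `u2`, ...), the function name, and the parameters `n` and `min_count` are hypothetical.

```python
from collections import Counter


def repeated_patterns(sequences, n=3, min_count=2):
    """Count every n-gram of unit labels and keep those seen at least min_count times."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Hypothetical "phoneme" label sequences, one per utterance.
utterances = [
    ["u3", "u7", "u1", "u4", "u9"],
    ["u2", "u3", "u7", "u1", "u5"],
    ["u8", "u3", "u7", "u1", "u4"],
]
patterns = repeated_patterns(utterances, n=3)
```

Here the trigram `("u3", "u7", "u1")` recurs in all three utterances, making it a candidate keyword. Real label sequences are noisy, so a practical system would likely need approximate matching rather than exact n-gram equality.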

Taxonomy

Topics: Advanced Text Analysis Techniques · Natural Language Processing Techniques · Music and Audio Processing