The Unsupervised Acquisition of a Lexicon from Continuous Speech

Carl de Marcken (MIT Artificial Intelligence Laboratory)

arXiv:cmp-lg/9512002·cmp-lg·February 3, 2008·61 cites

TL;DR

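This paper introduces an unsupervised algorithm that learns a natural-language lexicon directly from raw speech. It combines optimal encoding of symbol sequences in an MDL framework, a hierarchical representation of language, and a speech model built on articulatory-gesture features. The authors report that it compares very favorably to other published results on segmentation performance and statistical efficiency.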

Contribution

It presents an unsupervised learning algorithm that acquires a lexicon from raw speech, using a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures.

Findings

01 Lexicons and language models acquired from raw speech, text, and phonetic transcripts

02 Segmentation performance that compares very favorably to other reported results

03 Statistical efficiency that compares very favorably to other reported results

Abstract

We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.
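
The MDL idea in the abstract can be illustrated with a toy calculation. The following is a minimal sketch under stated assumptions, not the paper's algorithm: it works on plain text rather than speech, it charges a flat per-character cost for storing each lexicon entry (the paper instead uses a hierarchical representation), and the names `segment`, `description_length`, and `bits_per_char` are hypothetical. It shows how a lexicon of words can yield a shorter total description (lexicon plus encoded input) than a lexicon of single letters.

```python
import math


def segment(text, lexicon):
    """Viterbi search for the cheapest parse of `text` into lexicon entries.

    `lexicon` maps each entry to a probability; using an entry costs
    -log2(p) bits, the codeword length under an ideal code.
    Returns (total_bits, list_of_entries).
    """
    n = len(text)
    max_len = max(len(w) for w in lexicon)
    best = [math.inf] * (n + 1)  # best[i]: cheapest encoding of text[:i]
    back = [0] * (n + 1)         # back[i]: start index of the last entry
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in lexicon and best[start] < math.inf:
                cost = best[start] - math.log2(lexicon[word])
                if cost < best[end]:
                    best[end] = cost
                    back[end] = start
    if best[n] == math.inf:
        raise ValueError("text cannot be parsed with this lexicon")
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return best[n], words[::-1]


def description_length(text, lexicon, bits_per_char=5.0):
    """Total bits: cost of storing the lexicon plus cost of the parsed text.

    Assumption: each lexicon entry is stored at a flat `bits_per_char`
    per character, a simplification chosen for this sketch.
    """
    lexicon_bits = sum(len(w) * bits_per_char for w in lexicon)
    corpus_bits, _ = segment(text, lexicon)
    return lexicon_bits + corpus_bits


text = "thecatthedog"
letters = {ch: 1 / 8 for ch in "thecadog"}        # 8 distinct letters
words = {"the": 0.5, "cat": 0.25, "dog": 0.25}

print(segment(text, words)[1])
print(description_length(text, letters), description_length(text, words))
```

Under these assumptions the word lexicon gives the shorter total description, which is the kind of comparison an MDL learner uses to prefer one lexicon over another. The probabilities here are fixed by hand; an actual learner would re-estimate them and propose or delete entries iteratively.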

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Speech Recognition and Synthesis