Unsupervised Word Discovery: Boundary Detection with Clustering vs. Dynamic Programming

Simon Malan; Benjamin van Niekerk; Herman Kamper

arXiv:2409.14486 · eess.AS · January 14, 2025

Open Access

TL;DR

This paper introduces a simple, fast method for segmenting unlabeled speech into word-like units: it predicts word boundaries from the dissimilarity between adjacent self-supervised features, clusters the resulting segments into a lexicon, and achieves results comparable to the dynamic programming method ES-KMeans+.

Contribution

The authors propose a boundary prediction and clustering approach that simplifies and accelerates unsupervised word discovery, giving results similar to the updated ES-KMeans+ dynamic programming baseline on the five-language ZeroSpeech benchmarks.

Findings

01. Comparable accuracy to ES-KMeans+ on the five-language ZeroSpeech benchmarks

02. Almost five times faster than the ES-KMeans+ dynamic programming method

03. Boundary detection based on the dissimilarity between adjacent self-supervised features

Abstract

We look at the long-standing problem of segmenting unlabeled speech into word-like segments and clustering these into a lexicon. Several previous methods use a scoring model coupled with dynamic programming to find an optimal segmentation. Here we propose a much simpler strategy: we predict word boundaries using the dissimilarity between adjacent self-supervised features, then we cluster the predicted segments to construct a lexicon. For a fair comparison, we update the older ES-KMeans dynamic programming method with better features and boundary constraints. On the five-language ZeroSpeech benchmarks, our simple approach gives similar state-of-the-art results compared to the new ES-KMeans+ method, while being almost five times faster. Project webpage: https://s-malan.github.io/prom-seg-clus.
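
The two-step strategy described in the abstract (boundaries from the dissimilarity between adjacent features, then clustering of the resulting segments) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the boundary rule is a plain thresholded local maximum of the cosine distance, segments are mean-pooled, and a small k-means with farthest-point initialization stands in for the clustering step. The toy features are synthetic; in practice they would come from a self-supervised speech model.

```python
import numpy as np

def adjacent_dissimilarity(feats):
    """Cosine distance between each frame and the next one; shape (T-1,)."""
    unit = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return 1.0 - np.sum(unit[:-1] * unit[1:], axis=1)

def predict_boundaries(dissim, threshold=0.5):
    """Start frames of new segments: local maxima of the curve above a threshold.
    (Assumed rule for this sketch; the threshold value is arbitrary.)"""
    padded = np.concatenate(([-np.inf], dissim, [-np.inf]))
    is_peak = (padded[1:-1] > padded[:-2]) & (padded[1:-1] >= padded[2:])
    return np.flatnonzero(is_peak & (dissim > threshold)) + 1

def pool_segments(feats, boundaries):
    """Average the frames of each segment into one vector per segment."""
    edges = np.concatenate(([0], boundaries, [len(feats)]))
    return np.stack([feats[s:e].mean(axis=0) for s, e in zip(edges[:-1], edges[1:])])

def kmeans(x, k, iters=20):
    """Small k-means with deterministic farthest-point initialization."""
    chosen = [x[0]]
    for _ in range(1, k):
        dist_to_chosen = np.min([np.linalg.norm(x - c, axis=1) for c in chosen], axis=0)
        chosen.append(x[dist_to_chosen.argmax()])
    centroids = np.stack(chosen)
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return labels

# Toy "utterance": word type A (4 frames), word type B (3 frames), word type A (5 frames).
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
feats = np.concatenate([np.tile(a, (4, 1)), np.tile(b, (3, 1)), np.tile(a, (5, 1))])

dissim = adjacent_dissimilarity(feats)
boundaries = predict_boundaries(dissim)      # segment starts at frames 4 and 7
segments = pool_segments(feats, boundaries)  # one pooled vector per segment
labels = kmeans(segments, k=2)               # first and last segment share a cluster
print(boundaries, labels)
```

Unlike a dynamic programming search over candidate segmentations, this kind of pipeline makes a single pass over the dissimilarity curve before clustering, which is consistent with the speed advantage the abstract reports, although the sketch says nothing about the actual runtime of either method.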

Taxonomy

Topics: Fuzzy Logic and Control Systems · Semantic Web and Ontologies · Natural Language Processing Techniques