# Characterisation of speech diversity using self-organising maps

**Authors:** Tom A. F. Anderson, David M. W. Powers

arXiv: 1702.02092 · 2017-02-08

## TL;DR

This paper investigates speaker classification on large quantities of unlabelled speech using Kohonen self-organising maps (SOMs) and only a small set of manually annotated phonemes, with a focus on vowels in Australian pronunciation.

## Contribution

It reports a method for evaluating pronunciation with multilevel SOMs, applied to /hVd/ single-syllable utterances to study vowels in Australian pronunciation. The method builds on the Kohonen speech typewriter, a semi-supervised approach composed of SOMs.

## Key findings

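- The Kohonen speech typewriter, a semi-supervised method built from SOMs, achieves low phoneme error rates
- Multilevel SOMs are used to evaluate pronunciation from /hVd/ single-syllable utterances
- The method is applied to the study of vowels in Australian pronunciation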

## Abstract

We report investigations into speaker classification of larger quantities of unlabelled speech data using small sets of manually phonemically annotated speech. The Kohonen speech typewriter is a semi-supervised method comprised of self-organising maps (SOMs) that achieves low phoneme error rates. A SOM is a 2D array of cells that learn vector representations of the data based on neighbourhoods. In this paper, we report a method to evaluate pronunciation using multilevel SOMs with /hVd/ single syllable utterances for the study of vowels, for Australian pronunciation.
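The abstract's description of a SOM as a 2D array of cells that learn vector representations through neighbourhoods can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `train_som` and `best_matching_unit`, the 4x4 grid, the linear decay schedules, the Gaussian neighbourhood, and the toy 2D inputs (stand-ins for acoustic feature vectors) are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(grid, x):
    """Return (row, col) of the cell whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small 2D SOM and return a rows x cols grid of weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Each cell holds a weight vector with the same dimension as the inputs.
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)
        for x in samples:
            # Assumed schedule: learning rate and radius shrink linearly.
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            br, bc = best_matching_unit(grid, x)
            for r in range(rows):
                for c in range(cols):
                    # Cells near the winner on the grid are pulled harder.
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma ** 2))
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return grid


# Toy data: two well-separated clusters standing in for two vowel classes.
rng = random.Random(1)
low = [[rng.uniform(0.0, 0.2), rng.uniform(0.0, 0.2)] for _ in range(20)]
high = [[rng.uniform(0.8, 1.0), rng.uniform(0.8, 1.0)] for _ in range(20)]
som = train_som(low + high)
print(best_matching_unit(som, [0.1, 0.1]), best_matching_unit(som, [0.9, 0.9]))
```

Training uses no labels. In a semi-supervised setting such as the one the abstract describes, a small annotated set could then be used to name the cells that respond to each class, though how the paper does this is not stated in this summary.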

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.02092/full.md

## References

3 references — full list in the complete paper: https://tomesphere.com/paper/1702.02092/full.md

---
Source: https://tomesphere.com/paper/1702.02092