Detecting gross alignment errors in the Spoken British National Corpus

Ladan Baghai-Ravary; Sergio Grau; Greg Kochanski

arXiv:1101.1682 · cs.SD · January 11, 2011 · 1 cites

Open Access

TL;DR

This paper introduces methods for evaluating transcription-to-audio alignments in large speech corpora and for locating gross alignment errors, combining signal and label statistics so that manual examination can be directed to the likely problem areas.

Contribution

It presents a novel hybrid approach for detecting significant alignment errors in large speech datasets, enhancing quality control in speech corpus annotation.

Findings

01 Good agreement with human ratings of alignment accuracy

02 Methods indicate likely locations of alignment problems

03 Facilitates efficient manual examination of large corpora

Abstract

The paper presents methods for evaluating the accuracy of alignments between transcriptions and audio recordings. The methods have been applied to the Spoken British National Corpus, which is an extensive and varied corpus of natural unscripted speech. Early results show good agreement with human ratings of alignment accuracy. The methods also provide an indication of the location of likely alignment problems; this should allow efficient manual examination of large corpora. Automatic checking of such alignments is crucial when analysing any very large corpus, since even the best current speech alignment systems will occasionally make serious errors. The methods described here use a hybrid approach based on statistics of the speech signal itself, statistics of the labels being evaluated, and statistics linking the two.
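The abstract names three kinds of evidence (signal statistics, label statistics, and statistics linking the two) but does not spell out the computation. As an illustration only, and not the authors' method, the following minimal sketch flags a labelled segment when its duration is an outlier for its label, or when most of its frames look silent. The function name `flag_suspect_segments`, the thresholds, and the assumption that every label denotes speech are all hypothetical.

```python
from statistics import median


def flag_suspect_segments(segments, energies, frame_s=0.01,
                          z_max=3.5, silence_db=-50.0):
    """Return indices of segments that look misaligned (illustrative only).

    segments: list of (label, start_s, end_s) tuples.
    energies: per-frame energy in dB, one value every frame_s seconds.
    All thresholds are assumptions, not values from the paper.
    """
    # Label statistic: robust per-label duration model (median and MAD).
    by_label = {}
    for label, start, end in segments:
        by_label.setdefault(label, []).append(end - start)
    stats = {}
    for label, durs in by_label.items():
        med = median(durs)
        mad = median([abs(d - med) for d in durs])
        stats[label] = (med, mad)

    suspects = []
    for i, (label, start, end) in enumerate(segments):
        med, mad = stats[label]
        # 0.6745 rescales the MAD so z is comparable to a standard z-score.
        z = 0.6745 * ((end - start) - med) / mad if mad > 0 else 0.0

        # Signal/label link: a segment labelled as speech should not
        # consist mostly of silent frames.
        lo = int(round(start / frame_s))
        hi = int(round(end / frame_s))
        frames = energies[lo:hi]
        if frames:
            silent_frac = sum(e < silence_db for e in frames) / len(frames)
        else:
            silent_frac = 1.0

        if abs(z) > z_max or silent_frac > 0.5:
            suspects.append(i)
    return suspects
```

Returning indices rather than a single score mirrors the stated goal of pointing a human checker at likely problem locations; a real system would need per-label models trained on far more data than this sketch assumes.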

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing