Updating the silent speech challenge benchmark with deep learning

Yan Ji; Licheng Liu; Hongcui Wang; Zhilei Liu; Zhibin Niu; Bruce Denby

arXiv:1709.06818 · cs.CL · September 21, 2017 · 1 citation

Open Access

TL;DR

This paper updates the 2010 Silent Speech Challenge benchmark with deep learning, lowering the Word Error Rate from the published 17.4% to 6.4%. It also adds auto-encoder-based features and decoding results with two language models.

Contribution

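The paper applies a deep learning recognizer to the same input features and decoding strategy as the original benchmark, isolating the gain from the model itself. It then extends the benchmark with auto-encoder features at reduced dimensionality and with additional decoding scenarios, and releases the new features alongside the original data.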

Findings

01. Word Error Rate reduced from 17.4% to 6.4%

02. Auto-encoder features outperform the original features at reduced dimensionality

03. Updated archive includes both original and new features

Abstract

The 2010 Silent Speech Challenge benchmark is updated with new results obtained with a Deep Learning strategy, using the same input features and decoding strategy as in the original article. A Word Error Rate of 6.4% is obtained, compared to the published value of 17.4%. Additional results comparing new auto-encoder-based features with the original features at reduced dimensionality, as well as decoding scenarios using two different language models, are also presented. The Silent Speech Challenge archive has been updated to contain both the original and the new auto-encoder features, in addition to the original raw data.
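
For readers unfamiliar with the metric, the headline numbers can be put in context with a minimal sketch of how Word Error Rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. This is the standard textbook definition, not necessarily the scoring tool used in the paper; the function names `word_error_rate` and `relative_reduction` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace tokenization and unit cost for each
    substitution, deletion, and insertion (the usual WER convention).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction by which new_rate improves on old_rate."""
    return (old_rate - new_rate) / old_rate


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the benchmark data.
    print(word_error_rate("a b c d", "a x c"))
    # WER values reported in the abstract: 17.4% published, 6.4% updated.
    print(relative_reduction(17.4, 6.4))
```

Applied to the figures in the abstract, the drop from 17.4% to 6.4% is an absolute reduction of 11 percentage points, or roughly 63% relative to the published value.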

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing