Articulatory Feature Prediction from Surface EMG during Speech Production
Jihwan Lee, Kevin Huang, Kleanthis Avramidis, Simon Pistrosch, Monica Gonzalez-Machorro, Yoonjeong Lee, Björn Schuller, Louis Goldstein, Shrikanth Narayanan

TL;DR
This paper introduces a model that combines convolutional layers with a Transformer block to predict articulatory features from surface EMG signals. The predicted features can be decoded into intelligible speech waveforms, and an analysis of electrode placement offers guidance for configuring EMG electrodes.
Contribution
To the authors' knowledge, this is the first method to decode speech waveforms from surface EMG via predicted articulatory features. The model integrates convolutional layers and a Transformer block, followed by separate predictors for articulatory features.
Findings
Achieves a prediction correlation of approximately 0.9 for most articulatory features
Decodes the predicted articulatory features into intelligible speech waveforms
Relates EMG electrode placement to articulatory feature predictability, giving knowledge-driven guidance for electrode configuration
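As a rough illustration of how a per-feature prediction correlation like the one above could be computed, here is a minimal sketch. It assumes the reported score is a Pearson correlation taken over time frames for each articulatory feature, which this summary does not state; the function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant track
    return cov / math.sqrt(var_x * var_y)


def per_feature_correlation(pred, target):
    """pred, target: lists of frames, each frame a list of feature values.

    Returns one correlation per feature (assumed evaluation protocol).
    """
    n_features = len(target[0])
    return [
        pearson([frame[i] for frame in pred], [frame[i] for frame in target])
        for i in range(n_features)
    ]


# Toy data: 3 frames, 2 hypothetical articulatory features.
target = [[0.0, 1.0], [1.0, 0.0], [2.0, 3.0]]
pred = [[0.0, -1.0], [2.0, 0.0], [4.0, -3.0]]
scores = per_feature_correlation(pred, target)
```

Because correlation ignores scale and offset, a score near 0.9 indicates that a predicted trajectory tracks the shape of the reference trajectory, not that the two match in absolute value.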
Abstract
We present a model for predicting articulatory features from surface electromyography (EMG) signals during speech production. The proposed model integrates convolutional layers and a Transformer block, followed by separate predictors for articulatory features. Our approach achieves a high prediction correlation of approximately 0.9 for most articulatory features. Furthermore, we demonstrate that these predicted articulatory features can be decoded into intelligible speech waveforms. To our knowledge, this is the first method to decode speech waveforms from surface EMG via articulatory features, offering a novel approach to EMG-based speech synthesis. Additionally, we analyze the relationship between EMG electrode placement and articulatory feature predictability, providing knowledge-driven insights for optimizing EMG electrode configurations. The source code and decoded speech samples…
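The architecture described in the abstract (convolutional layers, a Transformer block, then separate predictors) might be sketched in PyTorch roughly as follows. This is not the authors' implementation: the class name, channel count, layer sizes, kernel widths, number of features, and the choice of one linear head per feature are all assumptions made for illustration, and positional encoding is omitted for brevity.

```python
import torch
from torch import nn


class EMGToArticulatory(nn.Module):
    """Hypothetical sketch: conv front end -> Transformer block -> per-feature heads."""

    def __init__(self, n_channels=8, d_model=64, n_features=12, n_heads=4):
        super().__init__()
        # Convolutional layers over time; sizes are assumed, not from the paper.
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, d_model, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # A single Transformer encoder layer stands in for the "Transformer block".
        layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=n_heads,
            dim_feedforward=4 * d_model,
            batch_first=True,
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=1)
        # Separate predictors: here, one linear head per articulatory feature.
        self.heads = nn.ModuleList(
            [nn.Linear(d_model, 1) for _ in range(n_features)]
        )

    def forward(self, emg):
        # emg: (batch, channels, time)
        h = self.conv(emg)        # (batch, d_model, time)
        h = h.transpose(1, 2)     # (batch, time, d_model)
        h = self.transformer(h)   # (batch, time, d_model)
        # Concatenate the per-feature outputs: (batch, time, n_features)
        return torch.cat([head(h) for head in self.heads], dim=-1)


model = EMGToArticulatory()
model.eval()
with torch.no_grad():
    out = model(torch.randn(2, 8, 50))  # 2 clips, 8 EMG channels, 50 frames
```

With the padding chosen here the time axis is preserved, so each input frame yields one vector of predicted articulatory features; a real system would likely downsample the EMG to the articulatory frame rate instead.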
Taxonomy
Topics: Phonetics and Phonology Research · Hand Gesture Recognition Systems · Hearing Impairment and Communication
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Dropout · Adam · Multi-Head Attention · Dense Connections · Layer Normalization · Softmax
