Comparison of sEMG Encoding Accuracy Across Speech Modes Using Articulatory and Phoneme Features
Chenqian Le, Ruisi Li, Beatrice Fumagalli, Yasamin Esmaeili, Xupeng Chen, Amirhossein Khalilian-Gourtani, Tianyu He, Adeen Flinker, Yao Wang

TL;DR
This study evaluates the effectiveness of SPARC features in predicting sEMG signals across different speech modes, demonstrating their robustness and interpretability for silent speech applications.
Contribution
The paper introduces SPARC features as mTRF predictors of sEMG across speech modes and reports higher accuracy and better interpretability than phoneme one-hot representations.
Findings
SPARC features outperform phoneme one-hot representations in sEMG prediction.
sEMG during subvocal speech is predicted above chance, indicating detectable articulatory activity.
mTRF weight patterns are consistent and anatomically interpretable across speech modes.
Abstract
We test whether Speech Articulatory Coding (SPARC) features can linearly predict surface electromyography (sEMG) envelopes across aloud, mimed, and subvocal speech in twenty-four subjects. Using an elastic-net multivariate temporal response function (mTRF) model with sentence-level cross-validation, SPARC yields higher prediction accuracy than phoneme one-hot representations on nearly all electrodes and in all speech modes. Aloud and mimed speech perform comparably, and subvocal speech remains above chance, indicating detectable articulatory activity. Variance partitioning shows a substantial unique contribution from SPARC and a minimal unique contribution from phoneme features. mTRF weight patterns reveal anatomically interpretable relationships between electrode sites and articulatory movements that remain consistent across modes. This study focuses on representation/encoding analysis (not…
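The encoding analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes NumPy, swaps the elastic-net penalty for a closed-form ridge penalty to stay dependency-free, scores with R^2 rather than whatever accuracy metric the paper uses, and runs on synthetic data. The function names (lag_matrix, fit_ridge, cv_r2, variance_partition), the lag range, and the random "articulatory" and "phoneme" stand-in features are all hypothetical.

```python
import numpy as np

def lag_matrix(X, lags):
    """Stack time-shifted copies of X (T x F), one per lag in samples.
    A positive lag means the feature at time t - lag predicts the response at t."""
    T, F = X.shape
    out = np.zeros((T, F * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.zeros_like(X)
        if lag > 0:
            shifted[lag:] = X[:-lag]
        elif lag < 0:
            shifted[:lag] = X[-lag:]
        else:
            shifted[:] = X
        out[:, i * F:(i + 1) * F] = shifted
    return out

def fit_ridge(X, y, alpha):
    """Closed-form ridge weights (assumption: stands in for elastic net)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def cv_r2(feats, emg, lags, alpha=1.0):
    """Leave-one-sentence-out R^2. feats and emg are per-sentence lists, and
    lagging is done within each sentence so no samples leak across sentences."""
    lagged = [lag_matrix(f, lags) for f in feats]
    y_true, y_pred = [], []
    for k in range(len(feats)):
        X_tr = np.vstack([x for i, x in enumerate(lagged) if i != k])
        y_tr = np.concatenate([y for i, y in enumerate(emg) if i != k])
        w = fit_ridge(X_tr, y_tr, alpha)
        y_true.append(emg[k])
        y_pred.append(lagged[k] @ w)
    y_true, y_pred = np.concatenate(y_true), np.concatenate(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def variance_partition(r2_a, r2_b, r2_joint):
    """Split explained variance into unique and shared parts from three fits."""
    return {
        "unique_a": r2_joint - r2_b,
        "unique_b": r2_joint - r2_a,
        "shared": r2_a + r2_b - r2_joint,
    }

# Synthetic demo: the "sEMG envelope" is driven only by feature set A.
rng = np.random.default_rng(0)
lags = [0, 1, 2, 3]
w_true = np.array([1.0, -0.5, 0.25])
feats_a, feats_b, emg = [], [], []
for _ in range(6):                      # six hypothetical sentences
    T = 200
    A = rng.standard_normal((T, 3))     # continuous stand-in features
    B = np.eye(4)[rng.integers(0, 4, size=T)]  # one-hot stand-in features
    y = lag_matrix(A, [2]) @ w_true + 0.1 * rng.standard_normal(T)
    feats_a.append(A)
    feats_b.append(B)
    emg.append(y)

feats_joint = [np.hstack([a, b]) for a, b in zip(feats_a, feats_b)]
r2_a = cv_r2(feats_a, emg, lags)
r2_b = cv_r2(feats_b, emg, lags)
r2_joint = cv_r2(feats_joint, emg, lags)
parts = variance_partition(r2_a, r2_b, r2_joint)
print(parts)
```

Because the synthetic response depends only on feature set A, nearly all explained variance lands in unique_a, which mirrors the shape of the reported result without reproducing it. Holding out whole sentences, rather than random samples, keeps temporally adjacent and therefore correlated samples from appearing in both the training and test folds.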
