End-to-End Learning of Speech 2D Feature-Trajectory for Prosthetic Hands

Mohsen Jafarzadeh; Yonas Tadesse

arXiv:2009.10283 · eess.AS · November 30, 2020

TL;DR

This paper introduces a lightweight end-to-end CNN that directly maps speech 2D features to prosthetic hand trajectories, enabling real-time control on embedded GPGPU devices without intermediate speech-to-text conversion.
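As an illustration of what a "speech 2D feature" looks like as network input, here is a minimal sketch that turns a waveform into a magnitude spectrogram (a frame-by-frequency matrix). It is not the authors' preprocessing: the function name `magnitude_spectrogram`, the frame length, the hop size, the Hann window, and the synthetic 1 kHz tone standing in for speech are all assumptions made for this example.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return a 2D list indexed [frame][frequency_bin] of DFT magnitudes."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep the non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive O(N^2) DFT: adequate for a sketch; an FFT would be used in practice.
        row = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        frames.append(row)
    return frames


# Synthetic stand-in for a speech clip: a pure tone sampled at 8 kHz.
SAMPLE_RATE = 8000
TONE_HZ = 1000
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(512)]

spec = magnitude_spectrogram(signal)
print(len(spec), len(spec[0]))  # number of frames, number of frequency bins
```

A matrix of this shape can be treated like a single-channel image, which is why a convolutional network is a natural fit for it.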

Contribution

It presents a novel end-to-end CNN approach for speech-to-trajectory mapping in prosthetic hands, bypassing traditional speech recognition steps and optimized for embedded GPGPU hardware.

Findings

01 Achieved a root-mean-square error of 0.119 in trajectory prediction.

02 The CNN runs in 20 ms on an NVIDIA Jetson TX2, enabling real-time control.

03 The method is compatible with various speech 2D features, such as spectrogram, MFCC, or PNCC.
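For readers unfamiliar with the metric behind the first finding, the sketch below computes a root-mean-square error between a predicted and a target trajectory. The summary does not say how the paper aggregates error over time steps and actuators or how trajectories are scaled, so pooling all values into one mean is an assumption here, and the function `rmse` and the toy trajectories are hypothetical.

```python
import math


def rmse(predicted, target):
    """Root-mean-square error between two equally shaped trajectories.

    Each trajectory is a list of time steps; each time step is a list of
    per-actuator values. All values are pooled into a single mean.
    """
    if len(predicted) != len(target):
        raise ValueError("trajectories must have the same number of time steps")
    squared_errors = []
    for p_step, t_step in zip(predicted, target):
        if len(p_step) != len(t_step):
            raise ValueError("time steps must have the same number of values")
        squared_errors.extend((p - t) ** 2 for p, t in zip(p_step, t_step))
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy data: three time steps, two actuators each.
target = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
predicted = [[0.1, 0.0], [0.5, 0.4], [0.9, 1.0]]
score = rmse(predicted, target)
print(round(score, 4))
```

Because the error is squared before averaging, a few large deviations raise the score more than many small ones.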

Abstract

Speech is one of the most common forms of human communication, and speech commands are an essential part of the multimodal control of prosthetic hands. In past decades, researchers controlled prosthetic hands with speech commands by using automatic speech recognition systems, which learn to map human speech to text, and then applying natural language processing or a look-up table to map the estimated text to a trajectory. However, the performance of conventional speech-controlled prosthetic hands is still unsatisfactory. Recent advancements in general-purpose graphics processing units (GPGPUs) enable intelligent devices to run deep neural networks in real time. Thus, architectures of intelligent systems have rapidly shifted from the paradigm of composite subsystems optimization to the paradigm of end-to-end optimization. In this paper, we…
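The structural difference between the two paradigms named in the abstract can be sketched as two function signatures: a two-stage pipeline that passes through text, and a single learned mapping from features to trajectory. Everything below is hypothetical scaffolding for illustration (`conventional_control`, `end_to_end_control`, the toy recognizer, the toy model, and the look-up table); none of it is the authors' code or a real recognizer.

```python
from typing import Callable, Dict, List, Sequence

Features = Sequence[Sequence[float]]  # 2D speech feature, indexed [frame][bin]
Trajectory = List[List[float]]        # indexed [time step][actuator]


def conventional_control(
    features: Features,
    recognize: Callable[[Features], str],
    lookup: Dict[str, Trajectory],
) -> Trajectory:
    """Two-stage pipeline: speech features -> text -> trajectory."""
    text = recognize(features)
    # A recognition error or an unlisted phrase leaves no trajectory to run.
    if text not in lookup:
        raise KeyError(f"no trajectory stored for command {text!r}")
    return lookup[text]


def end_to_end_control(
    features: Features,
    model: Callable[[Features], Trajectory],
) -> Trajectory:
    """Single learned mapping: speech features -> trajectory, no text stage."""
    return model(features)


def _mean(features: Features) -> float:
    return sum(map(sum, features)) / sum(len(row) for row in features)


# Hypothetical stand-ins so the sketch runs; neither is a real recognizer or CNN.
def toy_recognizer(features: Features) -> str:
    return "open" if _mean(features) > 0.5 else "close"


def toy_model(features: Features) -> Trajectory:
    return [[_mean(features) * step / 2] for step in range(3)]


LOOKUP = {"open": [[0.0], [0.5], [1.0]], "close": [[1.0], [0.5], [0.0]]}
features = [[0.9, 0.8], [0.7, 1.0]]

two_stage = conventional_control(features, toy_recognizer, LOOKUP)
direct = end_to_end_control(features, toy_model)
print(two_stage)
print(direct)
```

In the two-stage form, the output is limited to trajectories stored in the table and depends on the text being recognized correctly; in the end-to-end form, the trajectory is produced by the model itself.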

Code & Models

Repositories

ROBOTICSENGINEER/End_to_End_Learning_of_Speech_2D_Feature_Trajectory
TensorFlow · Official
