Neural2Speech: A Transfer Learning Framework for Neural-Driven Speech Reconstruction

Jiawei Li; Chunxu Guo; Li Fu; Lu Fan; Edward F. Chang; Yuanning Li

arXiv:2310.04644 · cs.SD · February 1, 2024 · 2 cites

TL;DR

Neural2Speech introduces a transfer learning framework that enables high-quality speech reconstruction from limited neural data by pre-training on speech corpora and adapting to neural recordings, advancing brain-computer interface communication.

Contribution

The paper presents a novel two-phase transfer learning approach that significantly improves speech reconstruction from small neural datasets compared to prior methods.

Findings

01. Effective reconstruction with only 20 minutes of neural data
02. Outperforms existing baseline methods in speech fidelity
03. Demonstrates feasibility of neural-driven speech synthesis

Abstract

Reconstructing natural speech from neural activity is vital for enabling direct communication via brain-computer interfaces. Previous efforts have explored the conversion of neural recordings into speech using complex deep neural network (DNN) models trained on extensive neural recording data, which is resource-intensive under regular clinical constraints. However, achieving satisfactory performance in reconstructing speech from limited-scale neural recordings has been challenging, mainly due to the complexity of speech representations and the neural data constraints. To overcome these challenges, we propose a novel transfer learning framework for neural-driven speech reconstruction, called Neural2Speech, which consists of two distinct training phases. First, a speech autoencoder is pre-trained on readily available speech corpora to decode speech waveforms from the encoded speech…
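The two-phase idea (pre-train a speech autoencoder on plentiful speech-only data, then adapt to a small paired neural dataset) can be illustrated with a toy sketch. This is not the authors' implementation: the data are synthetic, the "autoencoder" is a closed-form linear one (PCA), and the adaptation step is an assumed least-squares mapping from neural features to the frozen latent space. All names (`make_speech`, `make_neural`, `adapt`, the dimensions) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a large speech-only corpus and a small paired set.
n_speech, n_paired, n_test = 2000, 100, 200
d_speech, d_latent, d_neural = 32, 4, 16

# Synthetic stand-in for speech features: frames lie near a low-dim subspace.
basis = rng.standard_normal((d_latent, d_speech))
# Synthetic stand-in for neural recordings: a noisy linear view of speech.
mixing = rng.standard_normal((d_speech, d_neural))

def make_speech(n):
    z = rng.standard_normal((n, d_latent))
    return z @ basis + 0.01 * rng.standard_normal((n, d_speech))

def make_neural(speech):
    return speech @ mixing + 0.01 * rng.standard_normal((len(speech), d_neural))

# Phase 1: pre-train a linear autoencoder on speech only (PCA via SVD).
speech_corpus = make_speech(n_speech)
mean = speech_corpus.mean(axis=0)
_, _, vt = np.linalg.svd(speech_corpus - mean, full_matrices=False)
components = vt[:d_latent]  # shape (d_latent, d_speech)

def encode(x):
    return (x - mean) @ components.T

def decode(z):
    return z @ components + mean

# Phase 2: keep the autoencoder frozen and fit a small adaptor that maps
# neural features to the pre-trained latent codes, using few paired samples.
paired_speech = make_speech(n_paired)
paired_neural = make_neural(paired_speech)
design = np.hstack([paired_neural, np.ones((n_paired, 1))])  # add bias column
weights = np.linalg.lstsq(design, encode(paired_speech), rcond=None)[0]

def adapt(neural):
    return np.hstack([neural, np.ones((len(neural), 1))]) @ weights

# Inference on held-out data: neural -> adaptor -> frozen decoder -> speech.
test_speech = make_speech(n_test)
reconstruction = decode(adapt(make_neural(test_speech)))
rel_error = np.linalg.norm(reconstruction - test_speech) / np.linalg.norm(test_speech - mean)
print(f"relative reconstruction error: {rel_error:.4f}")
```

The point of the sketch is the division of labor: the decoder is learned from speech alone, so the scarce paired data only has to fit the comparatively small neural-to-latent mapping.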

Code & Models

Repositories

cctn-bci/neural2speech (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · EEG and Brain-Computer Interfaces