FedSpeech: Federated Text-to-Speech with Continual Learning

Ziyue Jiang; Yi Ren; Ming Lei; Zhou Zhao

arXiv:2110.07216·eess.AS·May 23, 2023

TL;DR

FedSpeech introduces a federated text-to-speech system using continual learning techniques to preserve speaker identity and privacy, achieving high-quality multi-speaker synthesis with limited local data.

Contribution

The paper presents a federated learning architecture for text-to-speech built on continual learning: gradual pruning masks isolate parameters to preserve each speaker's tone, selective masks reuse knowledge from other tasks, and a private speaker embedding protects users' privacy.
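To make the idea of isolating parameters with pruning masks more concrete, here is a minimal sketch, not the authors' implementation. It assumes a single flat weight vector shared by all speakers and uses one-shot magnitude pruning as a stand-in for the gradual pruning the paper describes. The function names, `keep_ratio`, and the example weights are all hypothetical.

```python
# Illustrative sketch only: one-shot magnitude pruning stands in for the
# paper's gradual pruning, and all names and values here are hypothetical.

def keep_largest(weights, free, keep_ratio):
    """Pick the largest-magnitude weights among the still-free positions."""
    candidates = [i for i, is_free in enumerate(free) if is_free]
    candidates.sort(key=lambda i: abs(weights[i]), reverse=True)
    n_keep = int(len(candidates) * keep_ratio)
    return set(candidates[:n_keep])


def isolate_parameters(weights, num_speakers, keep_ratio):
    """Give each speaker a disjoint binary mask over a shared weight vector."""
    free = [True] * len(weights)
    masks = []
    for _ in range(num_speakers):
        # A real system would first train this speaker on the free weights;
        # that step is omitted in this sketch.
        kept = keep_largest(weights, free, keep_ratio)
        masks.append([i in kept for i in range(len(weights))])
        for i in kept:
            free[i] = False  # reserved: later speakers may not claim it
    return masks


weights = [0.9, -0.1, 0.5, -0.8, 0.05, 0.3, -0.6, 0.2]
masks = isolate_parameters(weights, num_speakers=2, keep_ratio=0.5)
```

In this sketch each speaker's mask marks weights that are frozen for that speaker, so training a later speaker cannot overwrite them. Reusing another speaker's reserved weights, which the paper handles with selective masks, is not modeled here.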

Findings

01. Nearly matches multi-task training in speech quality

02. Retains speaker tones effectively

03. Outperforms multi-task training in speaker similarity

Abstract

Federated learning enables collaborative training of machine learning models under strict privacy restrictions, and federated text-to-speech aims to synthesize natural speech of multiple users from a few audio training samples stored locally on their devices. However, federated text-to-speech faces several challenges: very few training samples are available from each speaker, all training samples are stored on each user's local device, and the global model is vulnerable to various attacks. In this paper, we propose a novel federated learning architecture based on continual learning approaches to overcome these difficulties. Specifically, 1) we use gradual pruning masks to isolate parameters for preserving speakers' tones; 2) we apply selective masks for effectively reusing knowledge from tasks; 3) a private speaker embedding is introduced to keep users' privacy. Experiments on a reduced…

Taxonomy

Methods: Pruning