Eigenvoice Synthesis based on Model Editing for Speaker Generation

Masato Murata; Koichi Miyazaki; Tomoki Koriyama; Tomoki Toda

arXiv:2507.03377·cs.SD·July 8, 2025

TL;DR

This paper introduces a novel DNN-based eigenvoice synthesis method that defines a speaker space in the DNN model parameter space, enabling diverse speaker generation without reference speech and showing potential for attribute control.

Contribution

The paper proposes a new way to define the speaker space in DNN model parameters for speaker generation, extending traditional eigenvoice methods with model editing techniques.

Findings

01. Successfully generated diverse speaker voices.

02. Discovered a gender-dominant axis in the speaker space.

03. Demonstrated potential for attribute control in speaker synthesis.

Abstract

The speaker generation task aims to create unseen speaker voices without reference speech. The key to the task is defining a speaker space that represents diverse speakers, from which the traits of the generated speaker are determined. However, an effective way to define this speaker space remains unclear. Eigenvoice synthesis, a promising approach in the traditional parametric synthesis framework (e.g., HMM-based methods), defines a low-dimensional speaker space using pre-stored speaker features. This study proposes a novel DNN-based eigenvoice synthesis method via model editing. Unlike prior methods, our method defines a speaker space in the DNN model parameter space. By directly sampling new DNN model parameters in this space, we can create diverse speaker voices. Experimental results showed that our method can generate speech from diverse speakers. Moreover, we discovered a…
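
The abstract does not spell out how the speaker space is constructed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each training speaker is represented by a flattened vector of speaker-specific model parameters, that the space is obtained by PCA over those vectors (as in classical eigenvoice methods), and that new speakers are drawn by Gaussian sampling of the PCA coefficients. All function names are hypothetical, and random numbers stand in for real model parameters.

```python
import numpy as np


def build_speaker_space(speaker_params, n_components):
    """PCA over per-speaker parameter vectors (assumed construction).

    speaker_params: array of shape (n_speakers, n_params); each row is one
    speaker's flattened parameters (hypothetical representation).
    """
    mean = speaker_params.mean(axis=0)
    centered = speaker_params - mean
    # Rows of vt are orthonormal principal directions in parameter space.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    # Standard deviation of the training speakers along each direction.
    scale = s[:n_components] / np.sqrt(max(len(speaker_params) - 1, 1))
    return mean, basis, scale


def sample_speaker(mean, basis, scale, rng):
    """Draw coefficients from a Gaussian and map them back to parameters."""
    coeffs = rng.standard_normal(len(scale)) * scale
    return mean + coeffs @ basis, coeffs


def shift_along_axis(params, basis, axis, amount):
    """Move a parameter vector along a single axis of the speaker space."""
    return params + amount * basis[axis]


# Toy data: 20 "speakers" with 64 parameters each (placeholder values).
rng = np.random.default_rng(0)
speaker_params = rng.standard_normal((20, 64))

mean, basis, scale = build_speaker_space(speaker_params, n_components=4)
new_params, coeffs = sample_speaker(mean, basis, scale, rng)
shifted = shift_along_axis(new_params, basis, axis=0, amount=2.0)
```

In a real system, `new_params` would be reshaped and loaded back into the synthesis model before generating speech. Shifting along a single axis, as `shift_along_axis` does, is the kind of operation that would apply if one axis were tied to an attribute such as gender; which axis carries which attribute would have to be found empirically.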
