Activation Steering for Accent-Neutralized Zero-Shot Text-To-Speech

Mu Yang; John H. L. Hansen

arXiv:2603.05977 · eess.AS · March 9, 2026

Open Access

TL;DR

This paper presents a training-free, inference-time method using activation steering to neutralize accents in zero-shot TTS, effectively reducing accent while maintaining speaker identity and generalizing to unseen speakers.

Contribution

Introduces a novel post-hoc activation steering technique for accent neutralization in zero-shot TTS without additional training.

Findings

01 Effectively reduces accent in generated speech

02 Preserves speaker timbre accurately

03 Generalizes well to unseen accented speakers

Abstract

Zero-shot Text-to-Speech (TTS) models can generate speech that captures both the voice timbre and accent of a reference speaker. However, disentangling these attributes remains challenging, as the output often inherits both the accent and timbre from the reference. In this study, we introduce a novel, post-hoc, and training-free approach to neutralize accent while preserving the speaker's original timbre, utilizing inference-time activation steering. We first extract layer-specific "steering vectors" offline, which are derived from the internal activation differences within the TTS model between accented and native speech. During inference, the steering vectors are applied to guide the model to produce accent-neutralized, timbre-preserving speech. Empirical results demonstrate that the proposed steering vectors effectively mitigate the output accent and exhibit strong generalizability…
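
The two stages described in the abstract (offline extraction of a per-layer steering vector, then applying it at inference) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that the steering vector for a layer is the mean activation of native speech minus the mean activation of accented speech, and that steering adds a scaled copy of that vector to every frame. The function names, the scale `alpha`, the array shapes, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def extract_steering_vector(native_acts, accented_acts):
    """Mean activation difference for one layer (assumed: native minus accented).

    Both inputs have shape (num_utterances, hidden_dim), for example
    activations averaged over time for each utterance.
    """
    return native_acts.mean(axis=0) - accented_acts.mean(axis=0)


def apply_steering(hidden, vector, alpha=1.0):
    """Add the scaled steering vector to every frame.

    hidden has shape (num_frames, hidden_dim); alpha is a hypothetical
    strength knob, and alpha=0 leaves the activations unchanged.
    """
    return hidden + alpha * vector


# Synthetic stand-in for real TTS activations: accented samples are
# shifted along one made-up "accent" direction.
rng = np.random.default_rng(0)
hidden_dim = 8
accent_direction = np.zeros(hidden_dim)
accent_direction[0] = 1.0

native = rng.normal(size=(32, hidden_dim))
accented = rng.normal(size=(32, hidden_dim)) + 2.0 * accent_direction

# Offline step: one vector per layer (a single layer is shown here).
vector = extract_steering_vector(native, accented)

# Inference step: shift the activations of a new accented input.
frames = rng.normal(size=(5, hidden_dim)) + 2.0 * accent_direction
steered = apply_steering(frames, vector, alpha=1.0)
```

In a real model the same addition would typically be performed inside the forward pass of the chosen layers, for instance through a forward hook. Which layers to steer and how strongly are choices this sketch does not attempt to reproduce.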

Taxonomy

Topics: Speech Recognition and Synthesis · Voice and Speech Disorders · Phonetics and Phonology Research