TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking

Junzuo Zhou; Jiangyan Yi; Tao Wang; Jianhua Tao; Ye Bai; Chu Yuan Zhang; Yong Ren; Zhengqi Wen

arXiv:2406.04840 · cs.SD · November 18, 2024

TL;DR

TraceableSpeech is a TTS model that generates watermarked speech directly rather than watermarking audio after synthesis, improving watermark imperceptibility, speech quality, and robustness against resplicing attacks while handling speech of varying duration.

Contribution

The paper introduces a TTS model that imprints and extracts watermarks frame by frame as part of generation, improving robustness and temporal flexibility over methods that add watermarks in post-processing.

Findings

01. Outperforms the baseline in watermark imperceptibility and speech quality
02. Demonstrates robustness against resplicing attacks
03. Applicable to various speech durations

Abstract

Various threats posed by the progress in text-to-speech (TTS) have prompted the need to reliably trace synthesized speech. However, contemporary approaches to this task involve adding watermarks to the audio separately after generation, a process that hurts both speech quality and watermark imperceptibility. In addition, these approaches are limited in robustness and flexibility. To address these problems, we propose TraceableSpeech, a novel TTS model that directly generates watermarked speech, improving watermark imperceptibility and speech quality. Furthermore, we design the frame-wise imprinting and extraction of watermarks, achieving higher robustness against resplicing attacks and temporal flexibility in operation. Experimental results show that TraceableSpeech outperforms the strong baseline where VALL-E or HiFicodec individually uses WavMark in watermark imperceptibility, speech…
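To make the frame-wise idea in the abstract concrete, here is a minimal toy sketch. It is not the paper's method: TraceableSpeech imprints the watermark inside a neural TTS model, whereas this sketch swaps in a simple additive scheme with square-wave carriers purely for illustration. All names and values (`FRAME`, `N_BITS`, `ALPHA`, `make_carriers`, `embed`, `extract_frame`, `extract`, `demo`) are hypothetical, and the resplice is assumed to cut on frame boundaries.

```python
import math

FRAME = 256   # samples per frame (hypothetical value)
N_BITS = 4    # watermark length in bits (hypothetical value)
ALPHA = 0.05  # embedding strength (hypothetical value)


def make_carriers():
    """One +/-1 square-wave carrier per bit; mutually orthogonal over a frame."""
    return [[1.0 if (i >> k) & 1 else -1.0 for i in range(FRAME)]
            for k in range(N_BITS)]


def embed(signal, bits, carriers):
    """Add the full watermark to every complete frame of `signal`."""
    out = list(signal)
    for start in range(0, len(out) - FRAME + 1, FRAME):
        for bit, carrier in zip(bits, carriers):
            sign = 1.0 if bit else -1.0
            for i in range(FRAME):
                out[start + i] += ALPHA * sign * carrier[i]
    return out


def extract_frame(frame, carriers):
    """Recover the bits from one frame by correlating with each carrier."""
    return [int(sum(x * c for x, c in zip(frame, carrier)) > 0)
            for carrier in carriers]


def extract(signal, carriers):
    """Decode every complete frame, then take a per-bit majority vote."""
    votes = [0] * len(carriers)
    for start in range(0, len(signal) - FRAME + 1, FRAME):
        frame_bits = extract_frame(signal[start:start + FRAME], carriers)
        for k, bit in enumerate(frame_bits):
            votes[k] += 1 if bit else -1
    return [int(v > 0) for v in votes]


def demo():
    carriers = make_carriers()
    # Stand-in "speech": a low-amplitude sine wave spanning 8 frames.
    host = [0.1 * math.sin(2 * math.pi * i / 200) for i in range(8 * FRAME)]
    bits = [1, 0, 1, 1]
    marked = embed(host, bits, carriers)
    # Frame-aligned resplice: drop frames 2..5 and reorder what remains.
    spliced = marked[6 * FRAME:] + marked[:2 * FRAME]
    return extract(marked, carriers), extract(spliced, carriers)
```

Because every frame carries the whole watermark, `extract` does not depend on which frames survive or in what order, which is the intuition behind robustness to resplicing and to arbitrary durations. A real system would also have to handle cuts that do not fall on frame boundaries and would need a far less audible carrier than this toy uses.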

Code & Models

Repositories

zjzser/traceablespeech (PyTorch, official)

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Internet Traffic Analysis and Secure E-voting · User Authentication and Security Systems