TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking
Junzuo Zhou, Jiangyan Yi, Tao Wang, Jianhua Tao, Ye Bai, Chu Yuan Zhang, Yong Ren, and Zhengqi Wen

TL;DR
TraceableSpeech is a novel TTS model that directly embeds watermarks into generated speech, enhancing imperceptibility, quality, and robustness against attacks, while supporting flexible and diverse speech durations.
Contribution
It introduces a direct watermarking TTS model with frame-wise imprinting, improving robustness and flexibility over existing post-processing watermarking methods.
Findings
Outperforms baseline in watermark imperceptibility and speech quality
Demonstrates robustness against resplicing attacks
Applicable to various speech durations
Abstract
Various threats posed by the progress in text-to-speech (TTS) have prompted the need to reliably trace synthesized speech. However, contemporary approaches to this task involve adding watermarks to the audio separately after generation, a process that hurts both speech quality and watermark imperceptibility. In addition, these approaches are limited in robustness and flexibility. To address these problems, we propose TraceableSpeech, a novel TTS model that directly generates watermarked speech, improving watermark imperceptibility and speech quality. Furthermore, we design the frame-wise imprinting and extraction of watermarks, achieving higher robustness against resplicing attacks and temporal flexibility in operation. Experimental results show that TraceableSpeech outperforms the strong baseline where VALL-E or HiFicodec individually uses WavMark in watermark imperceptibility, speech…
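To make the idea of frame-wise imprinting and extraction concrete, here is a minimal sketch. It is not the TraceableSpeech model, which is a learned TTS system. As a stand-in, it uses classical quantization index modulation (QIM) on fixed-length frames. The frame length, bit count, step size, carrier construction, and every function name below are assumptions chosen for illustration. The point it shows is only this: when every frame carries the full watermark, a clip that has been cut and respliced can still be decoded from whichever frames survive.

```python
import numpy as np

FRAME = 256    # samples per frame (hypothetical value)
N_BITS = 8     # watermark bits carried by every frame (hypothetical value)
DELTA = 0.5    # QIM quantization step (hypothetical value)


def make_carriers(seed=0):
    """Return N_BITS orthonormal carrier vectors of length FRAME."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((FRAME, N_BITS)))
    return q.T  # shape (N_BITS, FRAME)


def imprint(signal, bits, carriers):
    """Write the same bit pattern into every full frame of `signal`."""
    n_frames = len(signal) // FRAME
    frames = signal[: n_frames * FRAME].reshape(n_frames, FRAME).copy()
    proj = frames @ carriers.T  # shape (n_frames, N_BITS)
    # Snap each projection to the nearest multiple of DELTA
    # whose parity (even/odd) equals the bit to embed.
    target = 2 * DELTA * np.round((proj - bits * DELTA) / (2 * DELTA)) + bits * DELTA
    frames += (target - proj) @ carriers
    return frames.reshape(-1)


def extract(signal, carriers):
    """Decode one bit pattern per full frame; returns shape (n_frames, N_BITS)."""
    n_frames = len(signal) // FRAME
    frames = signal[: n_frames * FRAME].reshape(n_frames, FRAME)
    proj = frames @ carriers.T
    return np.round(proj / DELTA).astype(int) % 2


rng = np.random.default_rng(1)
speech = rng.standard_normal(FRAME * 10)  # stand-in for generated speech
bits = np.array([1, 0, 1, 1, 0, 0, 1, 0])
carriers = make_carriers()
marked = imprint(speech, bits, carriers)

# Toy resplicing attack: keep frames 7, 2, 3, 9 in that order,
# cutting exactly on frame boundaries (an assumption of this sketch).
respliced = marked.reshape(-1, FRAME)[[7, 2, 3, 9]].reshape(-1)
decoded = extract(respliced, carriers)
print((decoded == bits).all())
```

The sketch assumes cuts fall on frame boundaries and that no noise or compression is applied. A practical system would also need to handle misaligned cuts and signal degradation, which this toy does not attempt.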
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Internet Traffic Analysis and Secure E-voting · User Authentication and Security Systems
