Speech Watermarking with Discrete Intermediate Representations

Shengpeng Ji; Ziyue Jiang; Jialong Zuo; Minghui Fang; Yifu Chen; Tao Jin; Zhou Zhao

arXiv:2412.13917 · eess.AS · December 19, 2024

TL;DR

DiscreteWM is a speech watermarking framework that embeds watermarks into the discrete intermediate representations of speech. It achieves high robustness and imperceptibility and can encode up to 150 bits of watermark per second.

Contribution

The paper introduces a watermarking method that operates in discrete latent space, using a vector-quantized autoencoder and a token manipulation strategy to keep the watermark imperceptible.

Findings

01 Achieves state-of-the-art robustness and imperceptibility

02 Can encode 1 to 150 bits of watermark per second

03 Effective for voice cloning detection and information hiding

Abstract

Speech watermarking techniques can proactively mitigate the potential harmful consequences of instant voice cloning techniques. These techniques involve the insertion of signals into speech that are imperceptible to humans but can be detected by algorithms. Previous approaches typically embed watermark messages into continuous space. However, intuitively, embedding watermark information into robust discrete latent space can significantly improve the robustness of watermarking systems. In this paper, we propose DiscreteWM, a novel speech watermarking framework that injects watermarks into the discrete intermediate representations of speech. Specifically, we map speech into discrete latent space with a vector-quantized autoencoder and inject watermarks by changing the modular arithmetic relation of discrete IDs. To ensure the imperceptibility of watermarks, we also propose a manipulator…
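The idea of carrying a watermark in the "modular arithmetic relation of discrete IDs" can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the simplest possible relation (one bit per token, stored as the parity `token_id % 2`), an even codebook size, and a naive replacement rule that flips the lowest bit of the ID. The function names `embed_bits` and `extract_bits` are hypothetical, and the learned manipulator mentioned in the abstract, which would choose replacements that stay imperceptible, is not modeled here.

```python
from typing import List


def embed_bits(token_ids: List[int], bits: List[int], codebook_size: int) -> List[int]:
    """Toy embedding: force token_ids[i] % 2 == bits[i] for each watermark bit.

    Assumption (not from the paper): parity is the modular relation, and a
    mismatching ID is replaced by its neighbor with the opposite parity.
    """
    if codebook_size % 2 != 0:
        raise ValueError("this sketch assumes an even codebook size")
    if len(bits) > len(token_ids):
        raise ValueError("more bits than tokens to carry them")
    if any(not 0 <= t < codebook_size for t in token_ids):
        raise ValueError("token ID outside the codebook")

    marked = list(token_ids)
    for i, bit in enumerate(bits):
        if marked[i] % 2 != bit:
            # Flipping the lowest bit changes parity and keeps the ID
            # inside [0, codebook_size) when the codebook size is even.
            marked[i] ^= 1
    return marked


def extract_bits(token_ids: List[int], n_bits: int) -> List[int]:
    """Toy detection: read the parity of the first n_bits token IDs."""
    return [t % 2 for t in token_ids[:n_bits]]


tokens = [10, 7, 512, 33]   # hypothetical discrete IDs from a VQ encoder
message = [1, 1, 0, 0]      # watermark bits to embed
marked_tokens = embed_bits(tokens, message, codebook_size=1024)
recovered = extract_bits(marked_tokens, len(message))
```

In this sketch only the tokens whose parity disagrees with the message are changed, so the number of modified IDs depends on the message. A practical system would also need the replacement token to decode to nearly identical audio, which is the role the abstract assigns to the manipulator.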

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Advanced Data Compression Techniques · Music and Audio Processing