EchoMark: Perceptual Acoustic Environment Transfer with Watermark-Embedded Room Impulse Response

Chenpei Huang; Lingfeng Yao; Kyu In Lee; Lan Emily Zhang; Xun Chen; Miao Pan

arXiv:2511.06458·cs.SD·April 1, 2026

TL;DR

EchoMark is a deep learning framework that transfers acoustic environments with embedded watermarks, enabling high-quality room impulse response generation while ensuring watermark robustness for security.

Contribution

It introduces the first perceptual acoustic environment transfer method with watermark embedding, operating in the latent domain to handle variable RIR characteristics.

Findings

01

Achieves room acoustic parameter matching comparable to state-of-the-art methods.

02

Attains a high mean opinion score (MOS) of 4.22 out of 5 for perceptual quality.

03

Watermark detection accuracy exceeds 99%, with a bit error rate (BER) below 0.3%.
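
For readers unfamiliar with the metric, the bit error rate is the fraction of watermark bits that are decoded incorrectly. The sketch below is a generic illustration, not the paper's evaluation code; the function name `bit_error_rate`, the 1000-bit payload, and the two flipped positions are all hypothetical.

```python
import numpy as np

def bit_error_rate(embedded_bits, decoded_bits):
    """Fraction of positions where the decoded bit differs from the embedded bit."""
    embedded = np.asarray(embedded_bits, dtype=np.uint8)
    decoded = np.asarray(decoded_bits, dtype=np.uint8)
    if embedded.shape != decoded.shape:
        raise ValueError("bit sequences must have the same shape")
    return float(np.mean(embedded != decoded))

# Hypothetical 1000-bit payload in which the decoder gets 2 bits wrong.
rng = np.random.default_rng(0)
payload = rng.integers(0, 2, size=1000)
decoded = payload.copy()
decoded[[10, 500]] ^= 1  # flip two bits to simulate decoding errors

ber = bit_error_rate(payload, decoded)  # 2 / 1000 = 0.002, i.e. 0.2%
```

Under this definition, a BER below 0.3% means fewer than 3 wrong bits per 1000 decoded bits on average.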

Abstract

Acoustic Environment Matching (AEM) is the task of transferring clean audio into a target acoustic environment, enabling engaging applications such as audio dubbing and immersive auditory virtual reality (VR). Recovering a similar room impulse response (RIR) directly from reverberant speech offers a more accessible and flexible AEM solution. However, this capability also introduces the vulnerability of arbitrary "relocation" if misused by a malicious user, such as facilitating advanced voice spoofing attacks or undermining the authenticity of recorded evidence. To address this issue, we propose EchoMark, the first deep learning-based AEM framework that generates perceptually similar RIRs with an embedded watermark. Our design tackles the challenges posed by variable RIR characteristics, such as different durations and energy decays, by operating in the latent domain. By jointly optimizing the…
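
As background for the AEM task described above, placing clean audio in a room is conventionally modeled as convolving the dry signal with that room's RIR. The sketch below illustrates only that generic step. It is not EchoMark's estimation or watermarking network, which the abstract does not detail here. The helper names `apply_rir` and `synthetic_rir`, the toy decaying-noise RIR, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def apply_rir(dry, rir):
    """Convolve a dry signal with a room impulse response (full linear convolution)."""
    dry = np.asarray(dry, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    return np.convolve(dry, rir, mode="full")

def synthetic_rir(sample_rate=16000, rt60=0.3, duration=0.5, seed=0):
    """Toy RIR: white noise under an exponential envelope that is 60 dB down at rt60 seconds."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sample_rate * duration)) / sample_rate
    envelope = 10.0 ** (-3.0 * t / rt60)  # amplitude factor 1e-3 (-60 dB) at t = rt60
    rir = rng.standard_normal(t.size) * envelope
    rir[0] = 1.0  # direct-path impulse
    return rir

# Hypothetical dry input: half a second of a 220 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate // 2) / sample_rate
dry = np.sin(2.0 * np.pi * 220.0 * t)

rir = synthetic_rir(sample_rate=sample_rate)
wet = apply_rir(dry, rir)  # reverberant version of the dry tone
```

The output is longer than the input by `len(rir) - 1` samples, which is the reverberant tail. RIRs of different durations and decay rates therefore produce outputs of different lengths, one practical face of the variability the abstract mentions.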
