MUSE: Model-Agnostic Tabular Watermarking via Multi-Sample Selection

Liancheng Fang; Aiwei Liu; Henry Peng Zou; Yankai Chen; Hengrui Zhang; Zhongfen Deng; Philip S. Yu

arXiv:2505.24267·cs.CR·June 2, 2025

Open Access · 3 Reviews

TL;DR

MUSE is a novel watermarking method for tabular generative models that uses multi-sample selection and scoring to embed watermarks without relying on invertibility, achieving high detectability and robustness.
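The select-by-score idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the keyed score hashes the whole row with SHA-256 (the paper's actual hash and pseudorandom functions are not given on this page), `toy_generator` is a hypothetical stand-in for any tabular model that can be sampled repeatedly, and detection is a plain z-test on the mean score.

```python
import hashlib
import math
import random


def score(row, key):
    """Keyed pseudorandom score in [0, 1) for one row (assumed hash: SHA-256)."""
    payload = (key + "|" + "|".join(map(str, row))).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def watermarked_sample(sample_row, key, m=2):
    """Draw m candidate rows and keep the one with the highest keyed score."""
    candidates = [sample_row() for _ in range(m)]
    return max(candidates, key=lambda row: score(row, key))


def detect_z(rows, key):
    """One-sided z-statistic on the mean score.

    Without a watermark the scores behave like Uniform(0, 1) draws,
    which have mean 1/2 and variance 1/12.
    """
    n = len(rows)
    mean_score = sum(score(row, key) for row in rows) / n
    return (mean_score - 0.5) * math.sqrt(12 * n)


# Hypothetical stand-in generator; no model inversion is needed anywhere.
rng = random.Random(0)


def toy_generator():
    return (rng.randint(18, 90), rng.choice(["A", "B", "C"]), round(rng.gauss(50.0, 10.0), 3))


key = "secret-key"  # illustrative key, an assumption
marked = [watermarked_sample(toy_generator, key, m=2) for _ in range(500)]
plain = [toy_generator() for _ in range(500)]
z_marked = detect_z(marked, key)  # large and positive for watermarked rows
z_plain = detect_z(plain, key)    # close to zero for unwatermarked rows
```

Because the generator is only ever called to draw samples, the same wrapper applies to diffusion, autoregressive, or masked models alike; the cost is m forward samples per released row.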

Contribution

MUSE introduces a model-agnostic watermarking approach for tabular data that leverages multi-sample selection and theoretical analysis for calibration, outperforming existing methods.

Findings

01

Achieves 81-89% reduction in distortion rates on fidelity metrics.

02

Attains a 1.0 TPR@0.1%FPR detection rate.

03

Demonstrates robustness against various attacks.
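For readers unfamiliar with the detection metric, the sketch below shows one common way to compute a true positive rate at a fixed false positive rate. The scores are synthetic Gaussian draws chosen purely for illustration (an assumption, not data from the paper), and `tpr_at_fpr` is a hypothetical helper name.

```python
import random


def tpr_at_fpr(null_scores, watermarked_scores, fpr=0.001):
    """Empirical TPR at a threshold that at most `fpr` of null scores exceed."""
    ranked = sorted(null_scores, reverse=True)
    allowed = int(fpr * len(ranked))   # null scores permitted above the threshold
    threshold = ranked[allowed]        # flag only scores strictly above this value
    hits = sum(s > threshold for s in watermarked_scores)
    return hits / len(watermarked_scores), threshold


# Synthetic detection statistics: unwatermarked tables near 0, watermarked shifted up.
rng = random.Random(0)
null_z = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
marked_z = [rng.gauss(8.0, 1.0) for _ in range(1_000)]

tpr, threshold = tpr_at_fpr(null_z, marked_z, fpr=0.001)
print(f"threshold={threshold:.2f}  TPR@0.1%FPR={tpr:.3f}")
```

The threshold is set on unwatermarked data only, so the reported rate reflects how often watermarked tables are caught while false alarms stay within the chosen budget.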

Abstract

We introduce MUSE, a watermarking algorithm for tabular generative models. Previous approaches typically leverage DDIM invertibility to watermark tabular diffusion models, but tabular diffusion models exhibit significantly poorer invertibility compared to other modalities, compromising performance. Simultaneously, tabular diffusion models require substantially less computation than other modalities, enabling a multi-sample selection approach to tabular generative model watermarking. MUSE embeds watermarks by generating multiple candidate samples and selecting one based on a specialized scoring function, without relying on model invertibility. Our theoretical analysis establishes the relationship between watermark detectability, candidate count, and dataset size, allowing precise calibration of watermarking strength. Extensive experiments demonstrate that MUSE achieves state-of-the-art…
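The link between detectability, candidate count m, and dataset size N can be illustrated with a back-of-the-envelope calculation. The sketch assumes each row's score is Uniform(0, 1) without a watermark and that the embedder keeps the highest-scoring of m candidates, so the expected score rises to m/(m+1); it is a simplified stand-in and does not reproduce the paper's theorem. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def expected_z(m, n):
    """Expected z-statistic when each of n rows keeps the max of m Uniform(0, 1) scores."""
    mean_shift = m / (m + 1) - 0.5     # E[max of m uniforms] minus the null mean 1/2
    return mean_shift * math.sqrt(12 * n)  # null variance of one score is 1/12


def rows_needed(m, fpr=0.001):
    """Smallest n whose expected z-statistic clears the one-sided threshold for `fpr`.

    Requires m >= 2; with m = 1 there is no selection and no shift to detect.
    """
    threshold = NormalDist().inv_cdf(1 - fpr)
    mean_shift = m / (m + 1) - 0.5
    return math.ceil(threshold**2 / (12 * mean_shift**2))


for m in (2, 3, 4, 8):
    print(f"m={m}  rows_needed={rows_needed(m)}  expected_z(N=200)={expected_z(m, 200):.2f}")
```

Under these assumptions the expected statistic grows with the square root of N and saturates in m, which is why a small candidate count can already be detectable on tables with a few hundred rows. The calculation uses only the expected value, so a practical calibration would add a margin for sampling variance.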

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. The approach departs from the inversion-based paradigm that dominates diffusion watermarking, so MUSE is compatible with any generative model that supports repeated sampling, including diffusion, autoregressive, and masked models.
2. Theoretical analysis of detectability and distribution preservation is provided, though I did not check the proofs thoroughly.

Weaknesses

1. The choice of pseudorandom function $f$ and hash function $H$ is abstracted away but is critical to real-world detectability and security. The paper provides no implementation sensitivity analysis, e.g., whether certain hash functions or key spaces degrade performance.
2. There is no ablation on the hyperparameter $m$ beyond the theory. While Theorem 4.1 gives a calibration, the experiments mostly fix $m=2$. There is no systematic analysis of how varying $m$ or $N$ affects trade-offs between computation, detectability,

Reviewer 02 · Rating 6 · Confidence 4

Strengths

(a) The watermarking approach is agnostic to the generative model: it draws multiple samples and picks the highest-scoring one under an appropriately chosen scoring function.
(b) Theoretical results are shown for detectability for a certain false positive rate of the watermarking approach and shown to be between 2 and 4 for a couple hundred [email protected].
(c) Watermarking at both the level of a single column (or a few columns) and the full set of columns is provided, trading off robustness versus distortion tra

Weaknesses

(i) The quantile rank is proposed as a way to thwart adversaries, but it is unclear whether adversaries could reverse engineer it.
(ii) Similarly, the question of how the watermarking approach could be broken is not fully addressed in the paper.
(iii) The complexity of handling both categorical and continuous features in the dataset is not discussed in detail.

Reviewer 03 · Rating 2 · Confidence 5

Strengths

1. The paper is generally well written and easy to follow.
2. The method's detectability and fidelity guarantees are supported by mathematical theorems (Theorem 4.1 and Theorem 4.3).
3. The method is also supported by experiments on real-world datasets (Adult, Default, Shoppers, and Beijing).

Weaknesses

1. The idea does not seem novel. https://arxiv.org/abs/2410.02099 and https://arxiv.org/pdf/2403.04808 have almost the same idea as your work, though they focus on watermarking large language models.
2. The paper does not consider additive noise attacks in its perturbation experiments. However, such attacks are an important robustness benchmark that has been widely considered in many prior works cited by the authors (https://dl.acm.org/doi/10.1145/3658644.3690373; https://openreview.net/forum?id=71

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection · Vehicle License Plate Recognition