Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning
Zhiyang Dai, Yansong Gao, Boyu Kuang, Haodong Li, Qi Chang, Gaurav Varshney, Derek Abbott, Anmin Fu

TL;DR
This paper explores repurposing data-poisoning backdoor attacks as watermarks for protecting contrastive learning datasets, addressing robustness and verification challenges.
Contribution
The paper systematically evaluates the limitations of existing data-poisoning backdoor attacks, proposes a statistical watermarking scheme, and demonstrates the scheme's effectiveness for dataset IP protection in contrastive learning.
Findings
Backdoor attacks show limited success and portability in contrastive learning.
Statistical divergence can distinguish trigger samples from clean data.
Proposed multi-level watermarking achieves trade-offs among fidelity, verifiability, and robustness.
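To make the second finding concrete, here is a minimal sketch of how a statistical check might separate trigger samples from clean ones in embedding space. This summary does not specify the paper's unified density metric, so everything here is an assumption for illustration: the k-nearest-neighbour density score, the permutation test, the function names (`self_knn_density`, `permutation_pvalue`), and the synthetic embeddings that stand in for the outputs of a suspect encoder.

```python
import numpy as np


def self_knn_density(x, k=5):
    """Leave-one-out density score: inverse mean distance to the k nearest
    neighbours within the same set (hypothetical stand-in metric)."""
    # Pairwise Euclidean distances, shape (n, n).
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    # Column 0 after sorting is the zero distance to the point itself; skip it.
    nearest = np.sort(dists, axis=1)[:, 1:k + 1]
    return 1.0 / (nearest.mean(axis=1) + 1e-12)


def permutation_pvalue(a, b, n_perm=1000, seed=0):
    """One-sided permutation test for mean(a) > mean(b)."""
    rng = np.random.default_rng(seed)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if perm[:len(a)].mean() - perm[len(a):].mean() >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Synthetic embeddings (assumption): clean samples are spread out, while
# trigger-stamped samples collapse into a tight cluster.
rng = np.random.default_rng(42)
clean_emb = rng.normal(0.0, 1.0, size=(200, 16))
trigger_emb = rng.normal(2.0, 0.3, size=(200, 16))

clean_density = self_knn_density(clean_emb)
trigger_density = self_knn_density(trigger_emb)
p_value = permutation_pvalue(trigger_density, clean_density)

print(f"mean clean density:   {clean_density.mean():.3f}")
print(f"mean trigger density: {trigger_density.mean():.3f}")
print(f"p-value: {p_value:.4f}")
```

A small p-value would indicate that trigger embeddings are denser than clean ones by more than chance alone explains. A test of this kind can give a usable signal even when the backdoor's attack success rate is low, which is the situation described in the findings above.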
Abstract
Contrastive learning (CL) reduces annotation cost by using automatically derived supervisory signals. Because building large-scale CL datasets in-house is infeasible, reliance on third-party or internet data is common. Recent studies show that CL models are vulnerable to data-poisoning backdoor attacks, but the generalization and robustness of these attacks remain underexplored. We systematically evaluate existing data-poisoning backdoor attacks on CL and reveal their limitations: poor dataset adaptability, low success rates, limited portability, and restrictive assumptions (e.g., knowledge of the downstream task). Interestingly, trigger samples exhibit a distinguishable statistical divergence from clean samples, which inspires repurposing such attacks as watermarks for dataset IP protection. Direct repurposing is challenging because of the low success rates; we overcome this through statistical verification using a unified density metric. We further propose a…
