IPProtect: protecting the intellectual property of visual datasets during data valuation

Gursimran Singh; Chendi Wang; Ahnaf Tazwar; Lanjun Wang; Yong Zhang

arXiv:2212.11468 · cs.CV · December 23, 2022

TL;DR

This paper introduces IPProtect, a method for safeguarding the intellectual property of visual datasets that must be shared during data valuation, balancing resistance to IP violations against valuation accuracy.

Contribution

It identifies and formalizes two IP risks in visual datasets, data-item (image) IP and statistical (dataset) IP, and proposes a sanitization algorithm that resists IP violations while still allowing accurate data valuation.

Findings

01 Effective dataset sanitization resisting IP violations

02 Maintains data utility for machine learning tasks

03 Outperforms baseline methods in experiments

Abstract

Data trading is essential to accelerate the development of data-driven machine learning pipelines. The central problem in data trading is to estimate the utility of a seller's dataset with respect to a given buyer's machine learning task, also known as data valuation. Typically, data valuation requires one or more participants to share their raw dataset with others, leading to potential risks of intellectual property (IP) violations. In this paper, we tackle the novel task of preemptively protecting the IP of datasets that need to be shared during data valuation. First, we identify and formalize two kinds of novel IP risks in visual datasets: data-item (image) IP and statistical (dataset) IP. Then, we propose a novel algorithm to convert the raw dataset into a sanitized version that resists IP violations while still allowing accurate data valuation. The key…
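To make the notion of data valuation in the abstract concrete, the sketch below scores a seller's dataset by how well a simple model trained on it performs on a buyer's validation samples. This is only an illustrative assumption about what "utility with respect to a buyer's task" could look like. It is not the IPProtect algorithm, and the nearest-centroid model, the function names (`fit_centroids`, `predict`, `valuation_score`), and the toy data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical sample type: (feature vector, class label).
Sample = Tuple[Sequence[float], int]


def fit_centroids(data: List[Sample]) -> Dict[int, List[float]]:
    """Compute one mean feature vector per class label."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = defaultdict(int)
    for features, label in data:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids: Dict[int, List[float]], features: Sequence[float]) -> int:
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def valuation_score(seller_data: List[Sample], buyer_validation: List[Sample]) -> float:
    """Assumed utility measure: accuracy on the buyer's validation samples
    of a nearest-centroid model trained only on the seller's data."""
    centroids = fit_centroids(seller_data)
    correct = sum(
        1 for features, label in buyer_validation
        if predict(centroids, features) == label
    )
    return correct / len(buyer_validation)


# Toy data (made up for illustration only).
buyer_validation = [((0.0, 0.1), 0), ((1.0, 0.9), 1), ((0.1, 0.0), 0), ((0.9, 1.0), 1)]
relevant_seller = [((0.0, 0.0), 0), ((0.2, 0.1), 0), ((1.0, 1.0), 1), ((0.8, 0.9), 1)]
mislabeled_seller = [((0.0, 0.0), 1), ((0.2, 0.1), 1), ((1.0, 1.0), 0), ((0.8, 0.9), 0)]

print(valuation_score(relevant_seller, buyer_validation))
print(valuation_score(mislabeled_seller, buyer_validation))
```

In this toy setup the seller dataset whose labels match the buyer's task scores higher than the mislabeled one. Note that the sketch needs the seller's raw samples to compute the score, which is the kind of sharing the abstract identifies as an IP risk.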

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Blockchain Technology Applications and Security · Retinal Imaging and Analysis