Regularized Loss Minimizers with Local Data Perturbation: Consistency and Data Irrecoverability

Zitao Li; Jean Honorio

arXiv:1805.07645·cs.LG·July 7, 2021

Open Access

TL;DR

This paper introduces the concept of data irrecoverability, demonstrating that certain regularized loss minimization methods with perturbed data can ensure generalization and prevent data recovery, with quantifiable convergence guarantees.

Contribution

It establishes a theoretical framework linking data perturbation, loss consistency, and data irrecoverability, expanding understanding beyond traditional privacy notions.

Findings

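01. Regularized loss minimizers trained on perturbed data can achieve loss consistency with theoretical guarantees.

02. Data irrecoverability is a broader concept than data privacy: privacy is sufficient but not necessary for it.

03. In several examples, convergence rates with perturbed data are within a constant factor, related to the amount of noise, of the rates with the original data.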

Abstract

We introduce a new concept, data irrecoverability, and show that the well-studied concept of data privacy is sufficient but not necessary for data irrecoverability. We show that there are several regularized loss minimization problems that can use perturbed data with theoretical guarantees of generalization, i.e., loss consistency. Our results quantitatively connect the convergence rates of the learning problems to the impossibility for any adversary to recover the original data from perturbed observations. In addition, we show several examples where the convergence rates with perturbed data exceed the convergence rates with original data only by a constant factor related to the amount of perturbation, i.e., noise.
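To make the setting in the abstract concrete, here is a minimal sketch of learning from locally perturbed data. It is an illustration under stated assumptions, not the paper's method: the choice of ridge regression as the regularized loss minimizer, the additive Gaussian noise on each sample's features, and every name and parameter (`ridge_fit`, `run_demo`, `noise_std`, `lam`) are hypothetical and do not come from the source.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form minimizer of (1/n) * ||Xw - y||^2 + lam * ||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


def squared_loss(X, y, w):
    """Mean squared prediction error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


def run_demo(n=2000, d=5, noise_std=0.5, lam=0.1, seed=0):
    """Fit ridge regression on original and on perturbed features.

    Assumption (illustrative only): each sample is perturbed independently
    with Gaussian noise of standard deviation noise_std before it is used.
    """
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)

    # Original training data from a synthetic linear model.
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.1 * rng.normal(size=n)

    # Local perturbation: noise is added per sample, so the learner
    # only ever sees X_pert, never X.
    X_pert = X + noise_std * rng.normal(size=X.shape)

    w_orig = ridge_fit(X, y, lam)
    w_pert = ridge_fit(X_pert, y, lam)

    # Evaluate both models on fresh, unperturbed test data.
    X_test = rng.normal(size=(n, d))
    y_test = X_test @ w_true + 0.1 * rng.normal(size=n)
    return {
        "loss_original": squared_loss(X_test, y_test, w_orig),
        "loss_perturbed": squared_loss(X_test, y_test, w_pert),
    }


if __name__ == "__main__":
    print(run_demo())
```

Varying `noise_std` and `n` in `run_demo` gives a rough empirical feel for the trade-off the abstract describes: more noise makes the original samples harder to recover but would typically raise the test loss of the model fit on perturbed data. The sketch does not reproduce any bound or rate from the paper.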

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Distributed Sensor Networks and Detection Algorithms · Stochastic Gradient Optimization Techniques