# Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning

**Authors:** Ahmed Salem, Apratim Bhattacharya, Michael Backes, Mario Fritz, Yang Zhang

arXiv: 1904.01067 · 2019-12-03

## TL;DR

This paper shows that the change in a black-box ML model's outputs before and after an update can leak information about the data used for the update (the updating set). It identifies this as a new privacy risk and proposes four deep-learning-based attacks that exploit it.

## Contribution

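The paper introduces four novel attacks, all following an encoder-decoder formulation, that infer information about the updating set from changes in model output in online learning. One of them uses a hybrid generative model (CBM-GAN) that is based on GANs but adds a reconstructive loss.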

## Key findings

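- The proposed attacks achieve strong performance at inferring information about the updating set.
- CBM-GAN, a GAN-based model with an added reconstructive loss, allows reconstructing accurate data samples.
- The change in a black-box model's output before and after an update leaks information about the updating set.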

## Abstract

Machine learning (ML) has progressed rapidly during the past decade and the major factor that drives such development is the unprecedented large-scale data. As data generation is a continuous process, this leads to ML model owners updating their models frequently with newly-collected data in an online learning scenario. In consequence, if an ML model is queried with the same set of data samples at two different points in time, it will provide different results. In this paper, we investigate whether the change in the output of a black-box ML model before and after being updated can leak information of the dataset used to perform the update, namely the updating set. This constitutes a new attack surface against black-box ML models and such information leakage may compromise the intellectual property and data privacy of the ML model owner. We propose four attacks following an encoder-decoder formulation, which allows inferring diverse information of the updating set. Our new attacks are facilitated by state-of-the-art deep learning techniques. In particular, we propose a hybrid generative model (CBM-GAN) that is based on generative adversarial networks (GANs) but includes a reconstructive loss that allows reconstructing accurate samples. Our experiments show that the proposed attacks achieve strong performance.
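The signal the abstract describes, querying a model with the same samples before and after an update and comparing the outputs, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the paper: the toy linear-softmax models standing in for the black-box model, the `probing_set`, and the function names `make_linear_model` and `posterior_difference`. The sketch only shows how an output-change vector could be assembled; it does not implement any of the four attacks.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Model = Callable[[Sequence[float]], Vector]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_linear_model(weights: Sequence[Sequence[float]]) -> Model:
    """Hypothetical stand-in for a black-box model: returns class probabilities."""
    def predict(x: Sequence[float]) -> Vector:
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        return softmax(logits)
    return predict


def posterior_difference(before: Model, after: Model,
                         probing_set: Sequence[Sequence[float]]) -> Vector:
    """Query both model versions with the same samples and concatenate
    the per-class change in output (after minus before)."""
    delta: Vector = []
    for x in probing_set:
        p_old = before(x)
        p_new = after(x)
        delta.extend(new - old for old, new in zip(p_old, p_new))
    return delta


# Assumed toy setup: the "update" is simulated by changing the weights.
model_before = make_linear_model([[1.0, 0.0], [0.0, 1.0]])
model_after = make_linear_model([[1.5, -0.2], [0.0, 1.0]])
probing_set = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

delta = posterior_difference(model_before, model_after, probing_set)
print(delta)
```

In an attack following the encoder-decoder formulation mentioned in the abstract, a vector like `delta` would plausibly serve as the encoder's input, with the decoder producing the inferred information about the updating set. That mapping is stated here as an assumption about how the pieces fit together; the full paper defines the actual construction.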

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.01067/full.md

## Figures

35 figures with captions in the complete paper: https://tomesphere.com/paper/1904.01067/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/1904.01067/full.md

---
Source: https://tomesphere.com/paper/1904.01067