How Much Training Data is Memorized in Overparameterized Autoencoders? An Inverse Problem Perspective on Memorization Evaluation

Koren Abitbul; Yehuda Dar

arXiv:2310.02897 · cs.LG · June 14, 2024 · 1 citation

Open Access

TL;DR

This paper introduces an inverse problem approach to evaluate how much training data is memorized by overparameterized autoencoders, providing a practical method that outperforms previous techniques in recovering training images from degraded inputs.

Contribution

The paper formulates the recovery of a training image from its degraded version as an inverse problem and develops an iterative optimization method, built around the trained autoencoder, that recovers training data even in challenging scenarios.

Findings

01. The method significantly outperforms previous memorization evaluation techniques.

02. It is effective in recovering training images from highly degraded inputs.

03. It is applicable across various autoencoder architectures and training conditions.

Abstract

Overparameterized autoencoder models often memorize their training data. For image data, memorization is often examined by using the trained autoencoder to recover missing regions in its training images (that were used only in their complete forms in the training). In this paper, we propose an inverse problem perspective for the study of memorization. Given a degraded training image, we define the recovery of the original training image as an inverse problem and formulate it as an optimization task. In our inverse problem, we use the trained autoencoder to implicitly define a regularizer for the particular training dataset that we aim to retrieve from. We develop the intricate optimization task into a practical method that iteratively applies the trained autoencoder and relatively simple computations that estimate and address the unknown degradation operator. We evaluate our method for…
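
The recovery idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the degradation is a known missing-entry mask (the paper estimates an unknown degradation operator), and it replaces the trained autoencoder with a hypothetical linear stand-in, an orthogonal projector onto the span of the training samples, which reconstructs every training sample exactly. The names `make_linear_autoencoder` and `recover` are illustrative only. Each iteration applies the "autoencoder" and then re-imposes the observed entries, which amounts to alternating projections.

```python
import numpy as np

def make_linear_autoencoder(train):
    """Toy stand-in for a trained autoencoder (an assumption, not the paper's model):
    the orthogonal projector onto the span of the training samples."""
    q, _ = np.linalg.qr(train.T)          # q: (d, n) orthonormal basis of the span
    return lambda x: q @ (q.T @ x)

def recover(y, observed, autoencoder, num_iters=300):
    """Alternate between applying the autoencoder and enforcing the observed entries."""
    x = y.copy()
    for _ in range(num_iters):
        x = autoencoder(x)                # pull the estimate toward the training data
        x[observed] = y[observed]         # data consistency on the known entries
    return x

rng = np.random.default_rng(0)
train = rng.standard_normal((5, 64))      # 5 training vectors of dimension 64
autoencoder = make_linear_autoencoder(train)

x_true = train[2]                         # the training sample to recover
observed = np.zeros(64, dtype=bool)
observed[rng.permutation(64)[:32]] = True # half of the entries survive
y = np.where(observed, x_true, 0.0)       # degraded input: missing entries zeroed

x_hat = recover(y, observed, autoencoder)
err_degraded = np.linalg.norm(y - x_true)
err_recovered = np.linalg.norm(x_hat - x_true)
print(f"error before: {err_degraded:.3f}, after: {err_recovered:.2e}")
```

In this toy setting the observed entries determine a unique point in the span of the training samples, so the iterations converge to the original sample. A real overparameterized autoencoder is nonlinear, and how closely such a scheme recovers training images is exactly what the paper's evaluation measures.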

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Medical Image Segmentation Techniques

Methods: Concatenated Skip Connection · Convolution · Max Pooling · U-Net · Inpainting