Remember the Past: Distilling Datasets into Addressable Memories for Neural Networks

Zhiwei Deng; Olga Russakovsky

arXiv:2206.02916 · cs.LG · November 22, 2022 · 22 cites

TL;DR

This paper introduces a dataset compression method using shared memory bases and flexible addressing, enabling efficient re-training and continual learning with improved accuracy and reduced data size.

Contribution

It presents a novel dataset distillation approach that uses shared bases and learned addressing functions, achieving higher compression and better performance in re-training and continual learning.

Findings

01 State-of-the-art results on dataset distillation benchmarks.

02 Significant accuracy improvements on CIFAR10 and CIFAR100.

03 Enhanced continual learning performance across multiple benchmarks.

Abstract

We propose an algorithm that compresses the critical information of a large dataset into compact addressable memories. These memories can then be recalled to quickly re-train a neural network and recover the performance (instead of storing and re-training on the full original dataset). Building upon the dataset distillation framework, we make a key observation that a shared common representation allows for more efficient and effective distillation. Concretely, we learn a set of bases (aka "memories") which are shared between classes and combined through learned flexible addressing functions to generate a diverse set of training examples. This leads to several benefits: 1) the size of compressed data does not necessarily grow linearly with the number of classes; 2) an overall higher compression rate with more effective distillation is achieved; and 3) more generalized queries are…
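The idea of shared bases combined through addressing functions can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes, purely for illustration, that each addressing function is a linear map from a one-hot class label to coefficients over the bases, and that an example is the resulting linear combination of bases. The names `recall_examples`, `bases`, and `addressing`, and all sizes, are hypothetical, and the random values stand in for parameters that would be learned during distillation.

```python
import numpy as np

def recall_examples(bases, addressing, num_classes):
    """Generate one synthetic example per (addressing matrix, class) pair.

    bases:      (K, d) array of shared "memories" (assumed shape)
    addressing: (r, C, K) array, r linear addressing maps (assumed form)
    """
    r = addressing.shape[0]
    labels = np.eye(num_classes)  # one-hot labels, shape (C, C)
    # Coefficients over the bases for every (map, class): shape (r, C, K)
    coeffs = np.einsum("cj,rjk->rck", labels, addressing)
    # Linear combination of the shared bases: shape (r, C, d)
    examples = np.einsum("rck,kd->rcd", coeffs, bases)
    x = examples.reshape(r * num_classes, -1)
    y = np.tile(np.arange(num_classes), r)
    return x, y

# Hypothetical sizes: 8 bases, flattened 32x32x3 images, 10 classes, 5 maps.
rng = np.random.default_rng(0)
K, d, C, r = 8, 3072, 10, 5
bases = rng.normal(size=(K, d))          # placeholder for learned bases
addressing = rng.normal(size=(r, C, K))  # placeholder for learned addressing

x, y = recall_examples(bases, addressing, C)
stored = bases.size + addressing.size    # numbers kept in memory
generated = x.size                       # numbers in the recalled examples
print(x.shape, stored, generated)
```

Under these assumptions the stored parameters (bases plus addressing coefficients) are far fewer than the values in the recalled training set, and adding a class only adds rows to the small addressing maps rather than whole new images.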

Code & Models

Videos

Remember the Past: Distilling Datasets into Addressable Memories for Neural Networks · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Geophysical Methods and Applications