Towards flexible perception with visual memory
Robert Geirhos, Priyank Jaini, Austin Stone, Sourabh Medapati, Xi Yi, George Toderici, Abhijit Ogale, Jonathon Shlens

TL;DR
This paper introduces a flexible visual memory system that combines neural networks with database-like capabilities, enabling dynamic data addition, removal, and interpretability for image classification tasks.
Contribution
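The paper proposes a visual memory framework that pairs a pre-trained embedding (for image similarity) with fast nearest neighbor retrieval from a knowledge database (for search), allowing data to be added and removed flexibly and giving vision models an interpretable decision mechanism.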
Findings
Supports billion-scale data addition and class-level updates
Enables data removal through unlearning and pruning
Provides an interpretable decision mechanism
Abstract
Training a neural network is a monolithic endeavor, akin to carving knowledge into stone: once the process is completed, editing the knowledge in a network is hard, since all information is distributed across the network's weights. We here explore a simple, compelling alternative by marrying the representational power of deep neural networks with the flexibility of a database. Decomposing the task of image classification into image similarity (from a pre-trained embedding) and search (via fast nearest neighbor retrieval from a knowledge database), we build on well-established components to construct a simple and flexible visual memory that has the following key capabilities: (1.) The ability to flexibly add data across scales: from individual samples all the way to entire classes and billion-scale data; (2.) The ability to remove data through unlearning and memory pruning; (3.) An…
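The decomposition described in the abstract (similarity from an embedding, plus nearest neighbor search over a database) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the authors' code: the `VisualMemory` class, its `add`, `remove`, and `classify` methods, the use of cosine similarity, and the plurality vote over the top-k neighbors are all hypothetical choices. Embeddings are hand-written toy vectors standing in for the output of a pre-trained model, and a brute-force scan stands in for the fast retrieval the paper relies on.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class VisualMemory:
    """Hypothetical toy memory: a dict of (embedding, label) pairs.

    Assumption: embeddings come from some pre-trained model that is
    not shown here; this class only stores and searches them.
    """

    def __init__(self):
        self.entries = {}  # entry_id -> (embedding, label)

    def add(self, entry_id, embedding, label):
        # Adding data is a database insert; no weights are updated.
        self.entries[entry_id] = (list(embedding), label)

    def remove(self, entry_id):
        # Removing data is a database delete; missing ids are ignored.
        self.entries.pop(entry_id, None)

    def classify(self, query, k=3):
        """Return (label, neighbor_ids) using a plurality vote over the
        k most similar stored embeddings (brute-force search)."""
        if not self.entries:
            raise ValueError("memory is empty")
        ranked = sorted(
            self.entries.items(),
            key=lambda item: cosine_similarity(query, item[1][0]),
            reverse=True,
        )
        neighbors = ranked[:k]
        votes = Counter(label for _, (_, label) in neighbors)
        label = votes.most_common(1)[0][0]
        # The neighbor ids are the evidence behind the decision.
        return label, [entry_id for entry_id, _ in neighbors]


memory = VisualMemory()
memory.add("cat_1", [1.0, 0.1], "cat")
memory.add("cat_2", [0.9, 0.2], "cat")
memory.add("dog_1", [0.1, 1.0], "dog")

label, evidence = memory.classify([0.2, 0.9], k=1)
print(label, evidence)

memory.remove("dog_1")
label_after, evidence_after = memory.classify([0.2, 0.9], k=1)
print(label_after, evidence_after)
```

In this sketch, adding or removing a sample changes the next prediction immediately, and the returned neighbor ids show which stored examples drove the decision. A system at the scale the abstract mentions would need an approximate nearest neighbor index rather than a linear scan.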
Taxonomy
Topics: CCD and CMOS Imaging Sensors · 3D Surveying and Cultural Heritage · Visual Attention and Saliency Detection
