Koopman Autoencoders Learn Neural Representation Dynamics
Nishant Suresh Aswani, Saif Eddin Jabari

TL;DR
This paper introduces Koopman autoencoders that model neural network representation dynamics as linear systems, enabling interpretability and targeted unlearning of classes in neural networks.
Contribution
The paper presents an autoencoder-based method for learning linear surrogate models of neural representation dynamics. The surrogate preserves the topology of the original representations, and because it is linear, its dynamics can be edited directly.
Findings
The surrogate models replicate the progressive topological simplification observed in neural networks.
The approach enables targeted class unlearning in neural networks.
The surrogate operates in a linear space, which makes editing the dynamics straightforward.
Abstract
This paper explores a simple question: can we model the internal transformations of a neural network using dynamical systems theory? We introduce Koopman autoencoders to capture how neural representations evolve through network layers, treating these representations as states in a dynamical system. Our approach learns a surrogate model that predicts how neural representations transform from input to output, with two key advantages. First, by way of lifting the original states via an autoencoder, it operates in a linear space, making editing the dynamics straightforward. Second, it preserves the topologies of the original representations by regularizing the autoencoding objective. We demonstrate that these surrogate models naturally replicate the progressive topological simplification observed in neural networks. As a practical application, we show how our approach enables targeted class…
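The idea of lifting states so that their evolution becomes linear can be illustrated with a small toy sketch. It is not the paper's method: the lifting here is hand-picked rather than learned by an autoencoder, the "network" is a hypothetical two-dimensional nonlinear map applied once per layer, and all names (LAM, MU, encode, decode, K) are assumptions made for illustration. The sketch only shows that, with a suitable lifting, a single matrix K reproduces the nonlinear layer-to-layer evolution.

```python
# Toy illustration only. A hand-picked lifting stands in for a learned encoder,
# and a simple nonlinear map stands in for one layer of a network.
LAM, MU = 0.9, 0.5  # hypothetical parameters of the toy map


def step_nonlinear(x):
    """One 'layer' of the toy system: a nonlinear update of a 2-D state."""
    x1, x2 = x
    return (LAM * x1, MU * x2 + (LAM ** 2 - MU) * x1 ** 2)


def encode(x):
    """Lift the 2-D state to 3-D by appending x1 squared (assumed lifting)."""
    x1, x2 = x
    return [x1, x2, x1 ** 2]


def decode(z):
    """Recover the original state from the first two lifted coordinates."""
    return (z[0], z[1])


# Linear operator acting on the lifted state. For this toy map it is exact.
K = [
    [LAM, 0.0, 0.0],
    [0.0, MU, LAM ** 2 - MU],
    [0.0, 0.0, LAM ** 2],
]


def apply_linear(matrix, z):
    """Matrix-vector product in plain Python."""
    return [sum(m_ij * z_j for m_ij, z_j in zip(row, z)) for row in matrix]


def rollout_nonlinear(x0, n_steps):
    """Apply the nonlinear map n_steps times in the original space."""
    x = x0
    for _ in range(n_steps):
        x = step_nonlinear(x)
    return x


def rollout_surrogate(x0, n_steps):
    """Encode once, apply K n_steps times in the lifted space, then decode."""
    z = encode(x0)
    for _ in range(n_steps):
        z = apply_linear(K, z)
    return decode(z)


x0 = (1.5, -0.7)
n_layers = 5
direct = rollout_nonlinear(x0, n_layers)
surrogate = rollout_surrogate(x0, n_layers)
max_error = max(abs(a - b) for a, b in zip(direct, surrogate))
print("direct:", direct)
print("surrogate:", surrogate)
print("max abs error:", max_error)
```

In this toy setting, editing the dynamics amounts to changing entries of K. In the learned setting described in the abstract, the encoder, decoder, and linear operator would be fit to representations taken from a real network rather than written down by hand.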
Taxonomy
Topics: Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis · Neural Networks and Reservoir Computing
