Hyper-Representations: Learning from Populations of Neural Networks
Konstantin Schürholt

TL;DR
This thesis introduces hyper-representations, a self-supervised approach for learning general representations of neural network weights, enabling better understanding, sampling, and transfer of models across architectures and tasks.
Contribution
The thesis proposes hyper-representations, a method for learning task-agnostic representations that capture meaningful structure in neural network weights, facilitating interpretability and transfer learning.
Findings
Neural networks occupy meaningful structures in weight space.
Hyper-representations can predict model performance and training state.
They enable sampling of models with targeted properties.
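As a rough illustration of the second finding, the sketch below embeds a population of flattened weight vectors and fits a linear probe that predicts a model property from the embedding. Everything in it is an assumption made for illustration: the population is synthetic, PCA stands in for the learned self-supervised encoder (it is not the architecture used in the thesis), and the names `make_toy_population`, `fit_linear_encoder`, `encode`, `fit_linear_probe`, and `r_squared` are hypothetical.

```python
import numpy as np


def make_toy_population(n_models=200, n_weights=64, seed=0):
    """Synthetic stand-in for a model zoo: one flattened weight vector per row."""
    rng = np.random.default_rng(seed)
    # Hypothetical property of each model, e.g. its state of training in [0, 1].
    progress = rng.uniform(0.0, 1.0, size=n_models)
    # Assumption: weights drift along one shared direction as training proceeds.
    direction = rng.normal(size=n_weights)
    noise = 0.1 * rng.normal(size=(n_models, n_weights))
    weights = progress[:, None] * direction[None, :] + noise
    return weights, progress


def fit_linear_encoder(weights, latent_dim=4):
    """PCA via SVD, used here as a simple stand-in for a learned encoder."""
    mean = weights.mean(axis=0)
    _, _, vt = np.linalg.svd(weights - mean, full_matrices=False)
    return mean, vt[:latent_dim]


def encode(weights, mean, components):
    """Project weight vectors onto the low-dimensional embedding space."""
    return (weights - mean) @ components.T


def fit_linear_probe(z, y):
    """Least-squares linear probe (with bias) from embeddings to a property."""
    design = np.hstack([z, np.ones((len(z), 1))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def r_squared(z, y, coef):
    """Coefficient of determination of the probe on the given data."""
    design = np.hstack([z, np.ones((len(z), 1))])
    residual = y - design @ coef
    return 1.0 - np.sum(residual**2) / np.sum((y - y.mean()) ** 2)


if __name__ == "__main__":
    weights, progress = make_toy_population()
    # Fit the encoder and the probe on 150 models, evaluate on the other 50.
    mean, components = fit_linear_encoder(weights[:150])
    z_train = encode(weights[:150], mean, components)
    z_test = encode(weights[150:], mean, components)
    coef = fit_linear_probe(z_train, progress[:150])
    print("held-out R^2:", r_squared(z_test, progress[150:], coef))
```

The probe is kept linear on purpose: if a simple linear map from the embedding recovers the property on held-out models, the property is plausibly encoded in the representation itself rather than in the capacity of the probe.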
Abstract
This thesis addresses the challenge of understanding Neural Networks through the lens of their most fundamental component: the weights, which encapsulate the learned information and determine the model behavior. At the core of this thesis is a fundamental question: Can we learn general, task-agnostic representations from populations of Neural Network models? The key contribution of this thesis toward answering that question is hyper-representations, a self-supervised method to learn representations of NN weights. Work in this thesis finds that trained NN models indeed occupy meaningful structures in the weight space that can be learned and used. Through extensive experiments, this thesis demonstrates that hyper-representations uncover model properties, such as their performance, state of training, or hyperparameters. Moreover, the identification of regions with specific properties in…
Taxonomy
Topics: Neural Networks and Applications
