Learning Useful Representations of Recurrent Neural Network Weight Matrices

Vincent Herrmann; Francesco Faccio; Jürgen Schmidhuber

arXiv:2403.11998 · cs.LG · April 30, 2025 · 1 citation

TL;DR

This paper introduces methods for learning representations of RNN weights that support both RNN analysis and downstream tasks. It shows that functionalist approaches, which examine an RNN's input-output mapping, outperform mechanistic ones at predicting the task an RNN was trained on.

Contribution

It develops two functionalist approaches that interrogate an RNN through probing inputs, provides a theoretical framework for the conditions under which such approaches yield rich representations, and releases datasets for evaluating RNN weight encodings.

Findings

01. Functionalist approaches outperform mechanistic ones in task prediction.

02. New datasets enable RNN weight representation learning.

03. Self-supervised evaluation shows clear advantages of proposed methods.

Abstract

Recurrent Neural Networks (RNNs) are general-purpose parallel-sequential computers. The program of an RNN is its weight matrix. How can we learn useful representations of RNN weights that facilitate RNN analysis as well as downstream tasks? While the mechanistic approach looks directly at an RNN's weights to predict its behavior, the functionalist approach analyzes its overall functionality, specifically its input-output mapping. We consider several mechanistic approaches for RNN weights and adapt the permutation equivariant Deep Weight Space layer for RNNs. Our two novel functionalist approaches extract information from RNN weights by 'interrogating' the RNN through probing inputs. We develop a theoretical framework that demonstrates conditions under which the functionalist approach can generate rich representations that help determine RNN behavior. We release the first two 'model zoo'…
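
The contrast between the two approaches can be illustrated with a minimal sketch. This is not the paper's implementation: the tiny tanh RNN, the helper names (`make_rnn`, `run_rnn`, `functionalist_repr`, `mechanistic_repr`, `permute_hidden`), and the use of fixed random probing inputs are all assumptions made for illustration. The sketch builds a weight-based encoding by flattening the weights and an input-output-based encoding by concatenating the RNN's outputs on probe sequences, then relabels the hidden units to show why permutation symmetry matters for weight-based encodings.

```python
import math
import random


def make_rnn(n_in, n_hidden, n_out, rng):
    """Return randomly initialised weight matrices as lists of rows (hypothetical toy RNN)."""
    def mat(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    return {
        "W_in": mat(n_hidden, n_in),
        "W_rec": mat(n_hidden, n_hidden),
        "W_out": mat(n_out, n_hidden),
    }


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def run_rnn(rnn, inputs):
    """Run a tanh RNN from a zero hidden state; return the output at every step."""
    h = [0.0] * len(rnn["W_rec"])
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(rnn["W_in"], x), matvec(rnn["W_rec"], h))]
        h = [math.tanh(p) for p in pre]
        outputs.append(matvec(rnn["W_out"], h))
    return outputs


def functionalist_repr(rnn, probes):
    """Concatenate the RNN's outputs on every probe sequence into one vector."""
    return [y for seq in probes for out in run_rnn(rnn, seq) for y in out]


def mechanistic_repr(rnn):
    """Flatten all weights into one vector (the simplest weight-based encoding)."""
    return [w for name in ("W_in", "W_rec", "W_out") for row in rnn[name] for w in row]


def permute_hidden(rnn, perm):
    """Relabel hidden units; the input-output mapping is mathematically unchanged."""
    return {
        "W_in": [rnn["W_in"][i] for i in perm],
        "W_rec": [[rnn["W_rec"][i][j] for j in perm] for i in perm],
        "W_out": [[row[j] for j in perm] for row in rnn["W_out"]],
    }


rng = random.Random(0)
rnn = make_rnn(n_in=2, n_hidden=4, n_out=1, rng=rng)

# Assumption: three fixed random probe sequences of five steps each.
probes = [[[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(5)] for _ in range(3)]

permuted = permute_hidden(rnn, [2, 0, 3, 1])

f_a = functionalist_repr(rnn, probes)
f_b = functionalist_repr(permuted, probes)
max_func_diff = max(abs(a - b) for a, b in zip(f_a, f_b))
mech_differs = mechanistic_repr(rnn) != mechanistic_repr(permuted)

print("functionalist vector length:", len(f_a))
print("functionalist encodings agree (up to rounding):", max_func_diff < 1e-9)
print("flattened weight vectors differ:", mech_differs)
```

In this toy setting the two networks compute the same function, so the probe-based encoding is the same up to floating-point rounding, while the naive flattened-weight encoding changes. A permutation equivariant layer, such as the Deep Weight Space layer the abstract mentions, is one way for a weight-based encoder to respect this symmetry.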

Code & Models

Repositories

vincentherrmann/rnn-weights-representation-learning
PyTorch · Official

Taxonomy

Topics: Neural Networks and Applications