Training and Generating Neural Networks in Compressed Weight Space
Kazuki Irie, Jürgen Schmidhuber

TL;DR
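This paper explores compressing neural network weight matrices with the discrete cosine transform (DCT), with a variant in which a recurrent neural network parameterises the compressed weights. The aim is to help scale approaches whose inputs or outputs are the weights of other networks, studied here on character-level language modeling.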
Contribution
It encodes the weight matrices of recurrent character-level language models with the DCT, adds a fast weight version in which a recurrent network parameterises the compressed weights, and provides experimental results on the enwik8 dataset.
Findings
Effective weight compression via DCT and RNN encoding
Potential for scalable neural network architectures
Experimental results reported for enwik8 character-level language modeling
Abstract
The inputs and/or outputs of some neural nets are weight matrices of other neural nets. Indirect encodings or end-to-end compression of weight matrices could help to scale such approaches. Our goal is to open a discussion on this topic, starting with recurrent neural networks for character-level language modelling whose weight matrices are encoded by the discrete cosine transform. Our fast weight version thereof uses a recurrent neural network to parameterise the compressed weights. We present experimental results on the enwik8 dataset.
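To make the idea of a DCT-encoded weight matrix concrete, here is a minimal sketch in plain Python. It assumes that only a small top-left (low-frequency) block of coefficients is stored and that the full matrix is recovered by zero-padding and applying an orthonormal inverse 2D DCT. The function names `dct_basis` and `decode_weights` are hypothetical, and the coefficient selection and normalisation used in the paper may differ.

```python
import math


def dct_basis(n):
    """Orthonormal DCT-II basis as an n x n list of rows.

    Row k, column i holds a_k * cos(pi * (2i + 1) * k / (2n)).
    """
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def decode_weights(coeffs, n_rows, n_cols):
    """Inverse 2D DCT of a small coefficient block.

    Assumption: `coeffs` is the top-left (low-frequency) block; every other
    coefficient is treated as zero, so W = C_r^T G C_c with G zero-padded.
    """
    k_r, k_c = len(coeffs), len(coeffs[0])
    c_r = dct_basis(n_rows)
    c_c = dct_basis(n_cols)
    weights = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            weights[i][j] = sum(
                coeffs[u][v] * c_r[u][i] * c_c[v][j]
                for u in range(k_r)
                for v in range(k_c)
            )
    return weights


# Illustrative values only: 4 stored coefficients expand to 64 weights.
coeffs = [[1.0, 0.5], [-0.3, 0.2]]
W = decode_weights(coeffs, 8, 8)
print(len(W), len(W[0]))  # 8 8
```

In this sketch the trainable parameters are the entries of `coeffs`, not the entries of `W`. In the fast weight version described in the abstract, a recurrent network would produce the compressed coefficients instead of their being fixed parameters; that part is not shown here.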
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
