Training and Generating Neural Networks in Compressed Weight Space

Kazuki Irie; Jürgen Schmidhuber

arXiv:2112.15545 · cs.LG · January 3, 2022 · 1 citation

TL;DR

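This paper studies recurrent neural networks for character-level language modelling whose weight matrices are encoded by the discrete cosine transform (DCT), including a fast weight variant in which a recurrent network parameterises the compressed weights. The stated motivation is to help scale networks whose inputs or outputs are the weight matrices of other networks.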

Contribution

It proposes encoding the weight matrices of recurrent language models with the DCT, adds a fast weight version in which a recurrent network generates the compressed weights, and reports experimental results on the enwik8 dataset.

Findings

01 Weight compression via DCT encoding, with an RNN parameterising the compressed weights

02 Potential for scaling networks that take or produce weight matrices

03 Experimental results on enwik8 character-level language modelling

Abstract

The inputs and/or outputs of some neural nets are weight matrices of other neural nets. Indirect encodings or end-to-end compression of weight matrices could help to scale such approaches. Our goal is to open a discussion on this topic, starting with recurrent neural networks for character-level language modelling whose weight matrices are encoded by the discrete cosine transform. Our fast weight version thereof uses a recurrent neural network to parameterise the compressed weights. We present experimental results on the enwik8 dataset.
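
The DCT encoding described in the abstract can be illustrated with a minimal sketch: a small block of trainable coefficients is zero-padded and passed through an inverse 2D DCT to produce a full weight matrix. The names `dct_matrix` and `decode_weights`, the matrix sizes, and the choice to keep the top-left (low-frequency) block are assumptions made for illustration only; the abstract does not specify which coefficients the paper keeps, and this is not the authors' code.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix C of shape (n, n), so that C @ x is the DCT of x."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # position index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)    # rescale the constant row to unit norm
    return c


def decode_weights(coeffs: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Zero-pad the coefficients to (rows, cols) and apply the inverse 2D DCT."""
    padded = np.zeros((rows, cols))
    # Assumption: the kept coefficients occupy the top-left, low-frequency block.
    padded[: coeffs.shape[0], : coeffs.shape[1]] = coeffs
    # For an orthonormal DCT the inverse is the transpose.
    return dct_matrix(rows).T @ padded @ dct_matrix(cols)


rng = np.random.default_rng(0)
coeffs = rng.normal(size=(4, 4))           # 16 parameters in the compressed space
weights = decode_weights(coeffs, 16, 32)   # 512 weights in the decoded matrix
```

In this sketch the coefficients are a fixed array. In the fast weight version described in the abstract, a recurrent network would produce them instead, so the decoded weight matrix could change from step to step.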

Code & Models

Repositories

kazuki-irie/dct-fast-weights · PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques