
TL;DR
This paper introduces quiver neural networks, a unified theoretical framework inspired by quiver representation theory, to analyze complex neural architectures and develop lossless model compression techniques.
Contribution
It introduces quiver neural networks as a uniform mathematical framework for analyzing network connectivity architectures, and uses parameter space symmetries to prove that a model compression algorithm is lossless for rescaling activations.
Findings
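Proves lossless model compression for quiver neural networks with rescaling activations, a class of non-pointwise activations.
Shows that, for radial rescaling activations, training the compressed model with gradient descent is equivalent to training the original model with projected gradient descent.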
Provides a unified framework for complex network architectures.
Abstract
We develop a uniform theoretical approach towards the analysis of various neural network connectivity architectures by introducing the notion of a quiver neural network. Inspired by quiver representation theory in mathematics, this approach gives a compact way to capture elaborate data flows in complex network architectures. As an application, we use parameter space symmetries to prove a lossless model compression algorithm for quiver neural networks with certain non-pointwise activations known as rescaling activations. In the case of radial rescaling activations, we prove that training the compressed model with gradient descent is equivalent to training the original model with projected gradient descent.
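To make the compression claim concrete, here is a minimal sketch, not the paper's algorithm. It assumes a radial rescaling activation has the form v ↦ λ(|v|)·v for a scalar function λ (the abstract does not give the definition), and it uses a bias-free network with one hidden layer. The choice λ = tanh, the layer sizes, and all function and variable names are hypothetical. The symmetry it relies on is that such an activation commutes with any norm-preserving linear map, so a QR factorization of the first weight matrix can be pushed through the activation and absorbed into the next layer.

```python
import numpy as np

def radial_rescale(v, lam=np.tanh):
    """Assumed form of a radial rescaling activation: v -> lam(|v|) * v.

    The scalar function lam (tanh here) is a hypothetical choice.
    """
    return lam(np.linalg.norm(v)) * v

rng = np.random.default_rng(0)
d, n, m = 3, 8, 2                 # input, hidden, output widths (hidden > input)
W1 = rng.normal(size=(n, d))      # first layer, no bias in this sketch
W2 = rng.normal(size=(m, n))      # second layer

def original(x):
    return W2 @ radial_rescale(W1 @ x)

# Reduced QR: W1 = Q @ R, where Q is (n, d) with orthonormal columns
# and R is (d, d). Since |Q y| = |y|, the activation commutes with Q:
# radial_rescale(Q @ y) == Q @ radial_rescale(y).
Q, R = np.linalg.qr(W1)
W2_small = W2 @ Q                 # absorb Q into the next layer, shape (m, d)

def compressed(x):
    return W2_small @ radial_rescale(R @ x)

x = rng.normal(size=d)
max_gap = float(np.max(np.abs(original(x) - compressed(x))))

params_original = W1.size + W2.size          # n*d + m*n
params_compressed = R.size + W2_small.size   # d*d + m*d
```

In this toy setting the hidden width drops from 8 to 3 while both networks compute the same function up to floating-point error, which is the sense in which the compression is lossless. A pointwise activation such as ReLU does not commute with a general orthogonal map, so the same argument would not apply to it.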
Taxonomy
Topics: Neural Networks and Applications · Seismic Imaging and Inversion Techniques · Model Reduction and Neural Networks
