Affine symmetries and neural network identifiability
Verner Vlačić, Helmut Bölcskei

TL;DR
This paper investigates when the architecture, weights, and biases of a feed-forward neural network can be identified from the function it realizes, focusing on affine symmetries of the nonlinearity and giving a full solution for tanh-type nonlinearities.
Contribution
It generalizes neural network identifiability results to arbitrary nonlinearities with affine symmetries, establishing when networks are uniquely determined by their functions.
Findings
Affine symmetries can be used to characterize all networks producing the same function.
For certain nonlinearities, the network is uniquely identifiable up to symmetries.
The paper provides a full solution for tanh-type nonlinearities regarding identifiability.
Abstract
We address the following question of neural network identifiability: Suppose we are given a function f and a nonlinearity ρ. Can we specify the architecture, weights, and biases of all feed-forward neural networks with respect to ρ giving rise to f? Existing literature on the subject suggests that the answer should be yes, provided we are only concerned with finding networks that satisfy certain "genericity conditions". Moreover, the identified networks are mutually related by symmetries of the nonlinearity. For instance, the tanh function is odd, and so flipping the signs of the incoming and outgoing weights of a neuron does not change the output map of the network. The results known hitherto, however, apply either to single-layer networks, or to networks satisfying specific structural assumptions (such as full connectivity), as well as to…
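The sign-flip symmetry mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the paper's construction: it assumes a single-hidden-layer tanh network stored as plain Python lists, and the helper names `forward` and `flip_neuron` are hypothetical. It also assumes the neuron's bias is treated as an incoming weight, so it is negated together with the incoming weights; otherwise the oddness of tanh would not apply to the full pre-activation.

```python
import math
import random


def forward(x, W1, b1, W2, b2):
    """Single-hidden-layer network: W2 @ tanh(W1 @ x + b1) + b2."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(W2, b2)]


def flip_neuron(W1, b1, W2, k):
    """Negate the incoming weights, bias, and outgoing weights of hidden neuron k.

    Since tanh(-z) = -tanh(z), the neuron's activation changes sign, and the
    negated outgoing weights cancel that sign change.
    """
    W1_flipped = [[-w for w in row] if i == k else list(row) for i, row in enumerate(W1)]
    b1_flipped = [-b if i == k else b for i, b in enumerate(b1)]
    W2_flipped = [[-w if j == k else w for j, w in enumerate(row)] for row in W2]
    return W1_flipped, b1_flipped, W2_flipped


if __name__ == "__main__":
    rng = random.Random(0)
    n_in, n_hidden, n_out = 3, 4, 2  # arbitrary illustrative sizes
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    W2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [rng.uniform(-1, 1) for _ in range(n_out)]

    W1f, b1f, W2f = flip_neuron(W1, b1, W2, k=2)

    # Compare the two networks on a few random inputs.
    max_diff = 0.0
    for _ in range(5):
        x = [rng.uniform(-3, 3) for _ in range(n_in)]
        y, yf = forward(x, W1, b1, W2, b2), forward(x, W1f, b1f, W2f, b2)
        max_diff = max(max_diff, max(abs(a - c) for a, c in zip(y, yf)))
    print("largest output difference:", max_diff)
```

The two parameter sets differ, yet they realize the same input-output map, which is why identifiability can only be expected up to such symmetries of the nonlinearity.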
