Rediscovering Deep Neural Networks Through Finite-State Distributions
Amir Emad Marvasti, Ehsan Emad Marvasti, George Atia, Hassan Foroosh

TL;DR
This paper introduces a probabilistic framework for deep neural networks using finite-state distributions, providing a principled basis for components like linear layers, activations, and pooling, and enabling exact information-theoretic analysis.
Contribution
The authors present a probabilistic interpretation of neural networks built on finite-state distributions, deriving the network components from principles of probability theory rather than from heuristic design choices.
Findings
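Entropy and Kullback-Leibler Divergence (KLD) can be computed exactly for finite-state distributions (FSDs).
The non-linearities derived in the framework are normalization layers; ReLU and Sigmoid arise as element-wise approximations to them.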
Max Pooling approximates marginalization of spatial variables.
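The first and third findings can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the example distributions, and the choice to marginalize in the log domain with log-sum-exp are all assumptions made here for illustration. Under those assumptions, entropy and KLD reduce to finite sums, and taking the max over a spatial axis is a lower bound on the exact log-domain marginal.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a finite-state distribution p."""
    p = np.asarray(p, dtype=float)
    nz = p > 0  # terms with p_i = 0 contribute 0 by convention
    return float(-np.sum(p[nz] * np.log(p[nz])))

def kld(p, q):
    """KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    nz = p > 0
    return float(np.sum(p[nz] * (np.log(p[nz]) - np.log(q[nz]))))

def log_marginalize(log_joint, axis):
    """Sum out one variable in the log domain (numerically stable log-sum-exp)."""
    m = np.max(log_joint, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(log_joint - m), axis=axis))

# Two distributions over three states (illustrative values).
p = np.array([0.5, 0.25, 0.25])
q = np.array([1 / 3, 1 / 3, 1 / 3])
h_p = entropy(p)   # finite sum, no sampling or estimation needed
d_pq = kld(p, q)   # equals log(3) - H(p) because q is uniform

# Hypothetical log-scores indexed by (spatial position, state).
log_joint = np.log(np.array([[0.50, 0.02, 0.01],
                             [0.01, 0.40, 0.06]]))
exact = log_marginalize(log_joint, axis=0)  # marginalize the spatial variable
approx = np.max(log_joint, axis=0)          # max-pooling style approximation

print("H(p) =", h_p, " KL(p||q) =", d_pq)
print("exact log-marginal:", exact)
print("max approximation: ", approx)
```

In this sketch the max never exceeds the exact log-marginal, and the gap is at most the log of the number of positions being pooled, so the approximation is tightest when one position dominates.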
Abstract
We propose a new way of thinking about deep neural networks, in which the linear and non-linear components of the network are naturally derived and justified in terms of principles in probability theory. In particular, the models constructed in our framework assign probabilities to uncertain realizations, leading to Kullback-Leibler Divergence (KLD) as the linear layer. In our model construction, we also arrive at a structure similar to ReLU activation supported with Bayes' theorem. The non-linearities in our framework are normalization layers with ReLU and Sigmoid as element-wise approximations. Additionally, the pooling function is derived as a marginalization of spatial random variables according to the mechanics of the framework. As such, Max Pooling is an approximation to the aforementioned marginalization process. Since our models are comprised of finite state distributions (FSD)…
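One way to read the claim that KLD plays the role of the linear layer is through the identity -KL(p || q) = H(p) + sum_i p_i log q_i: the second term is linear in the input distribution p, with weights log q. The sketch below checks that identity numerically. The variable names, the shapes, and the log-domain normalization step at the end are assumptions made for illustration, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    """Scale non-negative values so the last axis sums to one."""
    return x / x.sum(axis=-1, keepdims=True)

# Hypothetical input distribution over 5 states and 3 "filter" distributions.
p = normalize(rng.random(5) + 0.1)
Q = normalize(rng.random((3, 5)) + 0.1)

W = np.log(Q)                   # weights of the linear map: log-probabilities of each filter
bias = -np.sum(p * np.log(p))   # entropy of p, identical for every output unit

neg_kld_linear = W @ p + bias   # linear layer form
neg_kld_direct = -np.sum(p * (np.log(p) - np.log(Q)), axis=1)  # definition of -KL(p || q_k)

# Assumed normalization step: turn the scores into log-probabilities over the 3 outputs.
shift = neg_kld_linear.max()
log_post = neg_kld_linear - (shift + np.log(np.sum(np.exp(neg_kld_linear - shift))))

print(neg_kld_linear)
print(neg_kld_direct)
print(np.exp(log_post).sum())
```

The two score vectors agree, which is all the identity asserts; how the paper's normalization layers relate to ReLU and Sigmoid is not reproduced here.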
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis
Methods: Max Pooling
