Activation Functions, Statistics and Learning of Higher-Order Interactions in Restricted Boltzmann Machines
Giovanni di Sarra, Yasser Roudi

TL;DR
This paper analyzes how the hidden-unit activation function of a Restricted Boltzmann Machine (Linear, Step, ReLU, or Exponential) shapes the higher-order interactions the model can represent and learn, combining analytical results with simulations of training.
Contribution
It offers an analytical characterization of the interaction statistics induced by various activation functions in RBMs, linking activation nonlinearities to learning capabilities.
Findings
The choice of hidden-unit activation function determines the statistics of the interactions an RBM induces, and hence which complex data structures it can represent.
Exponential activation functions can better facilitate learning of high-order interactions.
Analytical predictions were validated against simulation results during training.
Abstract
The great success of neural networks in recognizing hidden patterns and correlations in complex data lies in the way they take advantage of the large number of parameters and nonlinear single-unit activation, jointly. Restricted Boltzmann Machines (RBMs) provide a simple yet powerful framework for studying the impact of activation nonlinearities on performance and representation. In this work, we exploit the duality between RBMs and models of interacting binary variables to study the statistics of the interactions induced by RBM ensembles with different hidden unit activation functions. We characterize the space of representable models analytically in terms of moments of the distribution of induced interactions for four commonly used activation functions: Linear, Step, ReLU, and Exponential. Quantitative predictions of the analytical calculations on learning show a very good agreement…
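As a rough illustration of the duality mentioned in the abstract, the sketch below marginalizes a single hidden unit and reads off the interactions it induces among binary visible units. It is not the authors' code. The function name `induced_interactions`, the choice of 0/1 visible units, and the two log-marginal functions `K` are assumptions made here for illustration: a quadratic `K`, as obtained from a Gaussian hidden unit whose mean response is linear in its input, and an exponential `K`, as obtained from a Poisson hidden unit whose mean response is exponential in its input. The couplings are recovered by Möbius inversion over subsets of visible units, which is only feasible for a handful of units.

```python
from itertools import combinations
import math


def induced_interactions(w, K):
    """Couplings induced on 0/1 visible units by one marginalized hidden unit.

    Assumes (for illustration) that the hidden unit contributes K(sum_i w[i]*s[i])
    to log p(s). Returns a dict mapping each tuple of visible indices A to the
    coefficient J_A in the expansion  K(.) = sum_A J_A * prod_{i in A} s_i.
    """
    n = len(w)
    # F[B]: contribution to log p(s) when exactly the units in B are switched on.
    F = {}
    for r in range(n + 1):
        for B in combinations(range(n), r):
            F[B] = K(sum(w[i] for i in B))
    # Moebius inversion: J_A = sum over subsets B of A of (-1)^(|A|-|B|) * F[B].
    J = {}
    for r in range(n + 1):
        for A in combinations(range(n), r):
            total = 0.0
            for q in range(r + 1):
                for B in combinations(A, q):
                    total += (-1) ** (r - q) * F[B]
            J[A] = total
    return J


# Hypothetical weights from three visible units to one hidden unit.
w = [0.5, -0.3, 0.8]
J_linear = induced_interactions(w, lambda x: 0.5 * x * x)  # linear response
J_exponential = induced_interactions(w, math.exp)          # exponential response

for order in (2, 3):
    for A in combinations(range(len(w)), order):
        print(A, J_linear[A], J_exponential[A])
```

Under these assumptions, the quadratic case yields pairwise couplings equal to `w[i] * w[j]` and a third-order coupling that vanishes up to floating-point error, whereas the exponential case yields nonzero couplings at every order, each equal to the product of `exp(w[i]) - 1` over the units involved. This is consistent with the summary's point that the nonlinearity controls which higher-order interactions are representable, though the paper's ensemble-level analysis goes well beyond this single-unit sketch.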
