Leveraging Product as an Activation Function in Deep Networks
Luke B. Godfrey, Michael S. Gashler

TL;DR
This paper introduces windowed product unit neural networks (WPUNNs), a novel approach that uses product as a nonlinearity, enabling effective training and achieving competitive results on MNIST and recurrent tasks.
Contribution
The paper proposes WPUNNs, a simple method to incorporate product as a nonlinearity, overcoming training difficulties of traditional product units and extending to recurrent networks.
Findings
WPUNNs perform comparably to ReLU on MNIST.
WPUNNs generalize gated units in RNNs, yielding results comparable to LSTM networks.
Windowing the product stabilizes training of product-based networks.
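One way to see why limiting the number of factors in a product could tame the gradients is sketched below. This is an illustration, not the authors' derivation; the symbols $x_j$ (the inputs to one product) and $k$ (the assumed window size) are introduced here only for the example.

```latex
\frac{\partial}{\partial x_i} \prod_{j=1}^{k} x_j \;=\; \prod_{\substack{j=1 \\ j \neq i}}^{k} x_j
```

Each partial derivative is itself a product of the remaining $k-1$ factors. When the product spans all $n$ inputs of a layer, that is $n-1$ factors, so the gradient magnitude can grow or shrink geometrically with layer width; with a small window it involves only $k-1$ factors.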
Abstract
Product unit neural networks (PUNNs) are powerful representational models with a strong theoretical basis, but have proven to be difficult to train with gradient-based optimizers. We present windowed product unit neural networks (WPUNNs), a simple method of leveraging product as a nonlinearity in a neural network. Windowing the product tames the complex gradient surface and enables WPUNNs to learn effectively, solving the problems faced by PUNNs. WPUNNs use product layers between traditional sum layers, capturing the representational power of product units and using the product itself as a nonlinearity. We find that this method works as well as traditional nonlinearities like ReLU on the MNIST dataset. We demonstrate that WPUNNs can also generalize gated units in recurrent neural networks, yielding results comparable to LSTM networks.
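The idea of placing product layers between sum layers can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the window size or how windows are laid out, so the non-overlapping windows of size 2, the helper names `sum_layer` and `windowed_product`, and the layer sizes are all hypothetical choices made for the example.

```python
import numpy as np


def sum_layer(x, W, b):
    """Traditional weighted-sum (affine) layer: x @ W + b."""
    return x @ W + b


def windowed_product(x, window=2):
    """Multiply non-overlapping groups of `window` adjacent features.

    Assumption for this sketch: windows do not overlap, so an input of
    shape (batch, features) yields shape (batch, features // window).
    """
    batch, features = x.shape
    if features % window != 0:
        raise ValueError("number of features must be divisible by window")
    # Group adjacent features, then take the product within each group.
    return x.reshape(batch, features // window, window).prod(axis=-1)


# Toy forward pass: sum layer -> windowed product -> sum layer.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                      # batch of 4, 8 inputs
W1, b1 = rng.normal(scale=0.5, size=(8, 6)), np.zeros(6)
W2, b2 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(2)

h = windowed_product(sum_layer(x, W1, b1), window=2)  # shape (4, 3)
y = sum_layer(h, W2, b2)                              # shape (4, 2)
```

In this sketch the product is the only nonlinearity between the two sum layers. With a window of size 2, each hidden value is the product of two learned linear combinations of the inputs, which is the same multiplicative form a gate takes when it scales another signal.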
Taxonomy
Topics: Machine Learning in Materials Science · Machine Learning and Data Classification · Neural Networks and Applications
