TL;DR
This paper investigates Hermite polynomial activations as an alternative to ReLUs in deep networks, demonstrating notable benefits in semi-supervised learning, including improved pseudo-label accuracy and robustness, supported by theoretical analysis.
Contribution
It introduces Hermite polynomial activations for deep networks, showing their advantages in semi-supervised learning and providing theoretical insights into their robustness and mathematical properties.
Findings
Hermite activations improve pseudo-label accuracy in SSL.
Networks with Hermite activations are more robust to noise.
Hermite-based models offer runtime and financial savings.
Abstract
Rectified Linear Units (ReLUs) are among the most widely used activation functions in a broad variety of tasks in vision. Recent theoretical results suggest that despite their excellent practical performance, in various cases, a substitution with basis expansions (e.g., polynomials) can yield significant benefits from both the optimization and generalization perspective. Unfortunately, the existing results remain limited to networks with a couple of layers, and the practical viability of these results is not yet known. Motivated by some of these results, we explore the use of Hermite polynomial expansions as a substitute for ReLUs in deep networks. While our experiments with supervised learning do not provide a clear verdict, we find that this strategy offers considerable benefits in semi-supervised learning (SSL) / transductive learning settings. We carefully develop this idea and show…
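To make the idea of replacing a ReLU with a Hermite polynomial expansion concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `relu_hermite_coeffs` and `hermite_activation`, the choice of degree 4, the normalized probabilists' basis h_i = He_i / sqrt(i!), and the idea of starting from the coefficients of ReLU's own expansion are all assumptions that the abstract does not confirm. In a trainable network the coefficients would typically be learnable parameters.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He  # probabilists' Hermite polynomials He_i


def relu_hermite_coeffs(degree):
    """Return c_i = E[relu(z) * h_i(z)] for z ~ N(0, 1), with h_i = He_i / sqrt(i!).

    Hypothetical helper: computes the coefficients exactly from half-Gaussian moments.
    """
    coeffs = []
    for i in range(degree + 1):
        # Monomial coefficients a_k of He_i(z) = sum_k a_k * z**k.
        basis = np.zeros(i + 1)
        basis[i] = 1.0
        mono = He.herme2poly(basis)
        # relu(z) * z**k = z**(k+1) on z > 0, and
        # E[z**p * 1{z > 0}] = 2**(p/2) * Gamma((p + 1) / 2) / (2 * sqrt(pi)).
        moments = [
            2 ** ((k + 1) / 2) * math.gamma((k + 2) / 2) / (2 * math.sqrt(math.pi))
            for k in range(len(mono))
        ]
        coeffs.append(float(np.dot(mono, moments)) / math.sqrt(math.factorial(i)))
    return np.array(coeffs)


def hermite_activation(x, coeffs):
    """Evaluate sum_i coeffs[i] * h_i(x) elementwise (hypothetical activation)."""
    # hermeval expects coefficients of the unnormalized He_i, so fold in 1/sqrt(i!).
    norms = np.sqrt([math.factorial(i) for i in range(len(coeffs))])
    return He.hermeval(x, coeffs / norms)


# Assumed setup: a degree-4 expansion initialized from ReLU's coefficients.
coeffs = relu_hermite_coeffs(4)
x = np.linspace(-2.0, 2.0, 5)
print(hermite_activation(x, coeffs))
```

Unlike a ReLU, the resulting activation is a smooth polynomial whose shape is controlled by a handful of coefficients, which is what makes it a candidate for the basis-expansion substitution the abstract describes.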
Taxonomy
Methods: Hermite Polynomial Activation
