Optimizing Neural Networks through Activation Function Discovery and Automatic Weight Initialization
Garrett Bingham

TL;DR
This dissertation advances AutoML by introducing techniques for discovering specialized activation functions and establishing robust weight initialization, which improve neural network performance and offer new perspectives on neural network optimization.
Contribution
It presents methods for automatically discovering activation functions and weight initialization strategies, emphasizing specialization to specific architectures and tasks and joint optimization of network components.
Findings
Solutions specialized to specific architectures and tasks outperform reused general approaches.
Jointly optimizing different components of a neural network is synergistic and yields better results.
Learned representations are easier to optimize.
Abstract
Automated machine learning (AutoML) methods improve upon existing models by optimizing various aspects of their design. While present methods focus on hyperparameters and neural network topologies, other aspects of neural network design can be optimized as well. To further the state of the art in AutoML, this dissertation introduces techniques for discovering more powerful activation functions and establishing more robust weight initialization for neural networks. These contributions improve performance, but also provide new perspectives on neural network optimization. First, the dissertation demonstrates that discovering solutions specialized to specific architectures and tasks gives better performance than reusing general approaches. Second, it shows that jointly optimizing different components of neural networks is synergistic, and results in better performance than optimizing…
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications
