SmartMixed: A Two-Phase Training Strategy for Adaptive Activation Function Learning in Neural Networks
Amin Omidvar

TL;DR
SmartMixed is a two-phase training strategy in which each neuron first learns to select an activation function from a candidate pool and then has that choice fixed, so the network adapts per neuron without sacrificing inference speed.
Contribution
It proposes a novel two-phase training method for adaptive activation function learning that maintains computational efficiency at inference.
Findings
Neurons in different layers prefer different activation functions.
SmartMixed achieves improved performance on the MNIST dataset.
The method reveals functional diversity in neural architectures.
Abstract
The choice of activation function plays a critical role in neural networks, yet most architectures still rely on fixed, uniform activation functions across all neurons. We introduce SmartMixed, a two-phase training strategy that allows networks to learn optimal per-neuron activation functions while preserving computational efficiency at inference. In the first phase, neurons adaptively select from a pool of candidate activation functions (ReLU, Sigmoid, Tanh, Leaky ReLU, ELU, SELU) using a differentiable hard-mixture mechanism. In the second phase, each neuron's activation function is fixed according to the learned selection, resulting in a computationally efficient network that supports continued training with optimized vectorized operations. We evaluate SmartMixed on the MNIST dataset using feedforward neural networks of varying depths. The analysis shows that neurons in different…
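The two phases described in the abstract can be illustrated with a minimal, forward-pass-only sketch in plain Python. It is not the author's implementation. It assumes each neuron carries one learnable logit per candidate and that the hard choice is the argmax over those logits. The names `phase_one_forward`, `freeze_selection`, and `phase_two_forward` are hypothetical, and the SELU, ELU, and Leaky ReLU constants are the commonly used defaults rather than values taken from the paper. How gradients reach the logits in phase one (the "differentiable" part of the hard-mixture mechanism) is not specified in this summary and is left out.

```python
import math

# Commonly used SELU constants (assumed; the summary does not state them).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    # Numerically stable form for large negative inputs.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def leaky_relu(x, slope=0.01):  # slope is an assumed default
    return x if x > 0 else slope * x


def elu(x, alpha=1.0):  # alpha is an assumed default
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x):
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


# Candidate pool listed in the abstract.
CANDIDATES = [
    ("relu", relu),
    ("sigmoid", sigmoid),
    ("tanh", math.tanh),
    ("leaky_relu", leaky_relu),
    ("elu", elu),
    ("selu", selu),
]


def _argmax(values):
    # Index of the largest value; ties resolve to the first occurrence.
    return max(range(len(values)), key=lambda i: values[i])


def phase_one_forward(pre_acts, logits):
    """Phase 1 (assumed form): each neuron applies its highest-logit candidate.

    pre_acts: one pre-activation per neuron.
    logits:   one list of len(CANDIDATES) selection logits per neuron.
    """
    return [CANDIDATES[_argmax(l)][1](z) for z, l in zip(pre_acts, logits)]


def freeze_selection(logits):
    """Fix each neuron's choice; return {activation name: [neuron indices]}."""
    groups = {}
    for j, neuron_logits in enumerate(logits):
        name = CANDIDATES[_argmax(neuron_logits)][0]
        groups.setdefault(name, []).append(j)
    return groups


def phase_two_forward(pre_acts, groups):
    """Phase 2: no logits needed; apply one function per group of neurons."""
    fns = dict(CANDIDATES)
    out = [0.0] * len(pre_acts)
    for name, indices in groups.items():
        f = fns[name]
        for j in indices:
            out[j] = f(pre_acts[j])
    return out
```

Grouping neurons by their fixed activation is what would let a tensor library replace the inner loop with one vectorized call per group, which is one plausible reading of the "optimized vectorized operations" mentioned in the abstract.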
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning in Materials Science
