Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks
Ameya D. Jagtap, Kenji Kawaguchi, George Em Karniadakis

TL;DR
This paper introduces locally adaptive activation functions with a slope recovery term for deep and physics-informed neural networks. The authors report faster training and convergence, and prove that gradient descent is not attracted to sub-optimal critical points under practical conditions on the initialization and learning rate.
Contribution
It presents layer-wise and neuron-wise adaptive activation functions, each with a trainable scaling parameter, together with a slope recovery term in the loss function that speeds up training and convergence.
Findings
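The slope recovery term accelerates convergence and reduces training cost.
Theoretical proof that gradient descent is not attracted to sub-optimal critical points or local minima under practical conditions on the initialization and learning rate.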
Implicit conditioning matrices improve optimization dynamics.
Abstract
We propose two approaches of locally adaptive activation functions namely, layer-wise and neuron-wise locally adaptive activation functions, which improve the performance of deep and physics-informed neural networks. The local adaptation of activation function is achieved by introducing a scalable parameter in each layer (layer-wise) and for every neuron (neuron-wise) separately, and then optimizing it using a variant of stochastic gradient descent algorithm. In order to further increase the training speed, an activation slope based slope recovery term is added in the loss function, which further accelerates convergence, thereby reducing the training cost. On the theoretical side, we prove that in the proposed method, the gradient descent algorithms are not attracted to sub-optimal critical points or local minima under practical conditions on the initialization and learning rate, and…
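The mechanism described in the abstract can be illustrated with a minimal sketch in plain Python. The abstract does not give formulas, so several details here are assumptions: the activation is taken to be tanh, the fixed scaling factor `n` and all function names are hypothetical, and the slope recovery term is assumed to be the reciprocal of the mean of exp(slope) across hidden layers, so that it shrinks as the slopes grow. It is an illustration of the idea, not the authors' implementation.

```python
import math


def adaptive_tanh_layer(z, a, n=1.0):
    """Layer-wise variant: one trainable slope `a` shared by all neurons in a layer.

    `n` is an assumed fixed scaling factor; with n * a == 1 this is plain tanh.
    """
    return [math.tanh(n * a * zi) for zi in z]


def adaptive_tanh_neuron(z, a, n=1.0):
    """Neuron-wise variant: one trainable slope per neuron."""
    if len(z) != len(a):
        raise ValueError("need exactly one slope per neuron")
    return [math.tanh(n * ai * zi) for zi, ai in zip(z, a)]


def slope_recovery_layerwise(slopes):
    """Assumed form: 1 / mean(exp(a_k)) over the hidden layers."""
    mean_exp = sum(math.exp(a) for a in slopes) / len(slopes)
    return 1.0 / mean_exp


def slope_recovery_neuronwise(slopes_per_layer):
    """Assumed form: 1 / mean over layers of exp(mean slope within the layer)."""
    mean_exp = sum(
        math.exp(sum(layer) / len(layer)) for layer in slopes_per_layer
    ) / len(slopes_per_layer)
    return 1.0 / mean_exp


def total_loss(data_loss, slopes, weight=1.0):
    """Base loss plus the (assumed) layer-wise slope recovery term.

    `weight` is a hypothetical coefficient for the extra term.
    """
    return data_loss + weight * slope_recovery_layerwise(slopes)
```

Under these assumptions, minimizing `total_loss` with respect to the slopes as well as the weights pushes the slopes upward, which is one way to read the abstract's claim that the extra term accelerates convergence. In a real network the slopes would be optimized jointly with the weights by an automatic-differentiation framework rather than handled as plain lists.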
