Utilizing Lyapunov Exponents in designing deep neural networks
Tirthankar Mittra

TL;DR
This paper explores using Lyapunov exponents to improve deep neural network training by guiding hyperparameter selection and initial weight choices, potentially accelerating convergence and reducing resource use.
Contribution
It introduces a novel approach of employing Lyapunov exponents to analyze and select hyperparameters and initial weights in deep neural networks.
Findings
Variations in the learning rate can induce chaotic changes in model weights.
Activation functions with more negative Lyapunov exponents improve convergence.
Lyapunov exponents can guide initial weight selection for better training outcomes.
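For reference, the quantity behind these findings can be written out as follows. This is the standard textbook definition of the largest Lyapunov exponent applied to a plain gradient descent update, not a formula taken from the paper. The symbols w_t (weights), \eta (learning rate), L (loss), and \delta_t (separation between two training runs started from nearly identical weights) are notation assumed here, and the exact estimator used in the paper is not stated in this summary.

```latex
w_{t+1} = w_t - \eta \, \nabla L(w_t),
\qquad
\lambda = \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1}
          \ln \frac{\lVert \delta_{t+1} \rVert}{\lVert \delta_t \rVert}
```

Under this definition, \lambda < 0 means nearby weight trajectories contract toward each other, while \lambda > 0 means they separate exponentially, which is the usual signature of chaotic dynamics.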
Abstract
Training large deep neural networks is resource intensive. This study investigates whether Lyapunov exponents can accelerate this process by aiding in the selection of hyperparameters. To study this, I formulate an optimization problem using neural networks with different activation functions in the hidden layers. By initializing model weights with different random seeds, I calculate the Lyapunov exponent while performing traditional gradient descent on these model weights. The findings demonstrate that variations in the learning rate can induce chaotic changes in model weights. I also show that activation functions with more negative Lyapunov exponents exhibit better convergence properties. The study also demonstrates that Lyapunov exponents can be used to select effective initial model weights for deep neural networks, potentially enhancing the optimization process.
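The procedure described in the abstract (tracking a Lyapunov exponent while gradient descent runs) can be illustrated with a minimal sketch. It is not the author's implementation: it assumes a two-trajectory, renormalized estimator (often called the Benettin method) and uses a toy quadratic loss instead of a neural network so that the result can be checked by hand. The function name `largest_lyapunov_exponent` and all parameter names are hypothetical.

```python
import numpy as np

def largest_lyapunov_exponent(grad, w0, lr, steps=500, eps=1e-8, seed=0):
    """Estimate the largest Lyapunov exponent of the gradient descent map
    w -> w - lr * grad(w) by following two nearby weight trajectories.

    Assumption: this two-trajectory estimator is chosen for illustration;
    the paper's own estimator is not given in the summary.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    # Second trajectory starts at distance eps in a random direction.
    direction = rng.normal(size=w.shape)
    v = w + eps * direction / np.linalg.norm(direction)

    log_growth = 0.0
    for _ in range(steps):
        # One gradient descent step on both trajectories.
        w = w - lr * grad(w)
        v = v - lr * grad(v)
        dist = np.linalg.norm(v - w)
        # Accumulate the log of the one-step stretch factor.
        log_growth += np.log(dist / eps)
        # Pull the second trajectory back to distance eps from the first.
        v = w + (eps / dist) * (v - w)
    return log_growth / steps

# Toy loss L(w) = 0.5 * ||w||^2, so grad(w) = w and each step scales
# perturbations by |1 - lr|. The exponent is therefore log|1 - lr|.
quadratic_grad = lambda w: w
w_init = np.array([1.0, -2.0])
lam = largest_lyapunov_exponent(quadratic_grad, w_init, lr=0.1)
print(lam, np.log(0.9))  # the two values should be close
```

For this toy loss the estimate can be compared with the analytic value log|1 - lr|, which is negative for 0 < lr < 2 (excluding lr = 1, where the perturbation collapses to zero in one step and the logarithm is undefined). For a real network, `grad` would be replaced by the gradient of the training loss with respect to the flattened weights, and the estimate could be repeated across random seeds.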
Taxonomy
Topics: Neural Networks and Applications
