Utilizing Lyapunov Exponents in designing deep neural networks

Tirthankar Mittra

arXiv:2410.05988·cs.LG·October 10, 2024

TL;DR

This paper explores using Lyapunov exponents to improve deep neural network training by guiding hyperparameter selection and initial weight choices, potentially accelerating convergence and reducing resource use.

Contribution

It proposes using Lyapunov exponents to analyze and select hyperparameters and initial weights for deep neural networks.

Findings

1. Variations in the learning rate can induce chaotic changes in model weights.
2. Activation functions with more negative Lyapunov exponents exhibit better convergence properties.
3. Lyapunov exponents can guide the selection of initial model weights for better training outcomes.

Abstract

Training large deep neural networks is resource intensive. This study investigates whether Lyapunov exponents can accelerate this process by aiding in the selection of hyperparameters. To study this, I formulate an optimization problem using neural networks with different activation functions in the hidden layers. By initializing model weights with different random seeds, I calculate the Lyapunov exponent while performing traditional gradient descent on these model weights. The findings demonstrate that variations in the learning rate can induce chaotic changes in model weights. I also show that activation functions with more negative Lyapunov exponents exhibit better convergence properties. Finally, the study demonstrates that Lyapunov exponents can be used to select effective initial model weights for deep neural networks, potentially enhancing the optimization process.
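The idea of computing a Lyapunov exponent alongside gradient descent can be illustrated with a one-parameter toy problem. The sketch below is not the paper's implementation: the function names `lyapunov_exponent` and `rank_seeds`, the double-well loss, and the initialization range are all assumptions chosen for illustration. It treats one gradient descent step, w -> w - lr * grad(w), as a 1-D map and averages the log of that map's absolute slope, |1 - lr * hess(w)|, along the trajectory. A negative value means nearby weight trajectories contract, and a positive value means they separate.

```python
import math
import random


def lyapunov_exponent(grad, hess, w0, lr, steps=500):
    """Estimate the Lyapunov exponent of the 1-D map w -> w - lr * grad(w).

    The exponent is the trajectory average of log|1 - lr * hess(w)|,
    i.e. the log of the map's absolute slope at each visited point.
    """
    w, total = w0, 0.0
    for _ in range(steps):
        slope = abs(1.0 - lr * hess(w))          # |d(next w) / d(w)| at the current w
        total += math.log(max(slope, 1e-300))    # guard against log(0)
        w = w - lr * grad(w)                     # plain gradient descent step
    return total / steps


def rank_seeds(seeds, lr, grad, hess):
    """Hypothetical helper: sort seeds by exponent, most negative first."""
    scored = []
    for seed in seeds:
        # Assumed initialization: one weight drawn uniformly from [-2, 2].
        w0 = random.Random(seed).uniform(-2.0, 2.0)
        scored.append((lyapunov_exponent(grad, hess, w0, lr), seed))
    return sorted(scored)


# Toy double-well loss L(w) = w**4 / 4 - w**2 / 2 (an assumption, not from the paper).
def dw_grad(w):
    return w ** 3 - w


def dw_hess(w):
    return 3 * w ** 2 - 1


if __name__ == "__main__":
    for exponent, seed in rank_seeds(range(5), lr=0.1, grad=dw_grad, hess=dw_hess):
        print(f"seed={seed} exponent={exponent:.4f}")
```

For a quadratic loss with curvature a, the map's slope is the constant 1 - lr * a, so the exponent reduces to log|1 - lr * a|. It is negative when the step contracts and positive once lr is large enough for the iterates to diverge, which is a simple check on the estimator.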

Code & Models

Repositories

tirthankar95/chaosoptim (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications