On Dissipativity of Cross-Entropy Loss in Training ResNets
Jens Püttschneider, Timm Faulwasser

TL;DR
This paper introduces a dissipative optimal control framework for training ResNets and neural ODEs using a modified cross-entropy regularization, demonstrating the turnpike phenomenon and enabling the design of shallow networks for classification.
Contribution
It proposes a novel dissipative formulation of ResNet training incorporating a cross-entropy variant, and proves the occurrence of the turnpike phenomenon in trained networks.
Findings
Trained ResNets provably exhibit the turnpike phenomenon under the dissipative formulation.
Training on the two spirals and MNIST datasets illustrates the theoretical results.
The turnpike property can be used to find very shallow networks suitable for a given classification task.
Abstract
The training of ResNets and neural ODEs can be formulated and analyzed from the perspective of optimal control. This paper proposes a dissipative formulation of the training of ResNets and neural ODEs for classification problems by including a variant of the cross-entropy as a regularization in the stage cost. Based on the dissipative formulation of the training, we prove that the trained ResNets exhibit the turnpike phenomenon. We then illustrate that the training exhibits the turnpike phenomenon by training on the two spirals and MNIST datasets. This can be used to find very shallow networks suitable for a given classification task.
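To make the optimal-control reading concrete, the following is a minimal NumPy sketch, not the authors' implementation. It treats each residual layer as one explicit Euler step of a neural ODE and evaluates a cross-entropy term at every layer as part of a stage cost. The function names, the `tanh` activation, the shared linear `readout`, the step size `h`, and the squared-norm penalty weighted by `reg` are all assumptions chosen for illustration; the paper's actual cross-entropy variant and stage cost may differ.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true class over the batch.
    p = softmax(logits)
    n = logits.shape[0]
    return -np.mean(np.log(p[np.arange(n), labels] + 1e-12))

def forward_with_stage_costs(x0, labels, weights, biases, readout, h=0.1, reg=1e-3):
    """Propagate a batch through residual layers and record per-layer costs.

    Assumed (hypothetical) ingredients:
      - layer map x_{k+1} = x_k + h * tanh(x_k W_k + b_k)  (explicit Euler step)
      - stage cost = cross-entropy of the current state through a shared
        linear readout + reg * squared norm of the layer parameters
    """
    x = x0
    stage_costs, step_norms = [], []
    for W, b in zip(weights, biases):
        ce = cross_entropy(x @ readout, labels)
        penalty = reg * (np.sum(W ** 2) + np.sum(b ** 2))
        stage_costs.append(ce + penalty)
        step = h * np.tanh(x @ W + b)
        # Mean per-sample displacement: small values mean the state barely moves.
        step_norms.append(np.linalg.norm(step, axis=1).mean())
        x = x + step
    return x, np.array(stage_costs), np.array(step_norms)
```

In this sketch, a run of layers with near-zero `step_norms` and flat `stage_costs` would correspond to the state resting near a steady state, which is the behavior the turnpike phenomenon describes; such layers are natural candidates to drop when looking for a shallower network.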
Taxonomy
Topics: Neural Networks and Applications
Methods: Kaiming Initialization · Max Pooling · Average Pooling · Global Average Pooling · Convolution
