Residual networks classify inputs based on their neural transient dynamics
Fereshteh Lagzi

TL;DR
This paper analyzes residual networks as dynamical systems, revealing how their transient residual dynamics contribute to classification, and introduces a method to optimize network depth during training for improved robustness and efficiency.
Contribution
It provides an analytical framework for understanding residual networks' classification based on transient dynamics and proposes a new method for adaptive depth adjustment during training.
Findings
Residual networks classify inputs by integrating their residual dynamics across layers.
ResNets are more robust to input noise than MLPs.
Pruning residual network depth maintains high classification accuracy.
Abstract
We analyze the input-output behavior of residual networks from a dynamical systems point of view by disentangling the residual dynamics from the output activities before the classification stage. For a network with simple skip connections between every pair of successive layers, a logistic activation function, and weights shared between layers, we show analytically that there are cooperation and competition dynamics between the residuals corresponding to each input dimension. When these kinds of networks are interpreted as nonlinear filters, the steady-state values of the residuals in the case of attractor networks are indicative of the common features between different input dimensions that the network has observed during training and has encoded in those components. In cases where residuals do not converge to an attractor state, their internal dynamics are separable for each input class, and the…
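The setup described in the abstract (skip connections between successive layers, a logistic activation, weights shared across layers) can be illustrated with a minimal sketch. The update rule `x_{t+1} = x_t + logistic(W x_t + b)`, the bias term, the random weights, and all names below (`residual_forward`, `W`, `b`, `x0`, `depth`) are assumptions made for illustration, not the paper's exact formulation. The sketch records the residual at every layer so that the residual dynamics can be inspected separately from the layer outputs.

```python
import math
import random

def logistic(z):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-z))

def residual_forward(x0, W, b, depth):
    """Iterate an assumed shared-weight residual update:
    x_{t+1} = x_t + logistic(W x_t + b).

    Returns (states, residuals): the state at every layer (including
    the input) and the residual added at each layer.
    """
    n = len(x0)
    x = list(x0)
    states = [list(x)]
    residuals = []
    for _ in range(depth):
        # Residual for this layer, computed with the same W and b every time.
        r = [logistic(sum(W[i][j] * x[j] for j in range(n)) + b[i])
             for i in range(n)]
        # Skip connection: add the residual to the current state.
        x = [x[i] + r[i] for i in range(n)]
        residuals.append(r)
        states.append(list(x))
    return states, residuals

# Hypothetical toy configuration (not taken from the paper).
random.seed(0)
n, depth = 3, 20
W = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
b = [0.0] * n
x0 = [0.5, -0.2, 0.1]

states, residuals = residual_forward(x0, W, b, depth)

# The final state equals the input plus the residuals summed over layers.
integrated = [sum(r[i] for r in residuals) for i in range(n)]
```

Under this assumed update, the final layer's activity is the input plus the sum of the residuals over depth, which is one way to read the statement that classification depends on integrating the residual dynamics.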
Taxonomy
Topics: Neural Networks and Reservoir Computing · Neural dynamics and brain function · Neural Networks and Applications
Methods: Pruning · Average Pooling · 1x1 Convolution · Residual Connection · Convolution · Max Pooling · Kaiming Initialization · Bottleneck Residual Block · Global Average Pooling
