Wormhole Dynamics in Deep Neural Networks
Yen-Lung Lai, Zhe Jin

TL;DR
This paper introduces an analytical framework for understanding deep neural network generalization. It shows that overparameterized networks undergo a collapse of the output feature space, that adding more layers eventually leads to degeneracy, and that a 'wormhole' solution bypasses the resulting trivial solutions and helps interpret fooling examples.
Contribution
It presents a new analytical approach based on maximum likelihood estimation that uncovers the wormhole solution, offering insights into DNN degeneracy and generalization behavior.
Findings
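Overparameterized DNNs exhibit a collapse in the output feature space, which improves generalization.
Adding more layers eventually leads to degeneracy: distinct inputs map to the same output, giving a trivial zero-loss solution.
The wormhole solution bypasses this degeneracy and helps interpret fooling examples.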
Abstract
This work investigates the generalization behavior of deep neural networks (DNNs), focusing on the phenomenon of "fooling examples," where DNNs confidently classify inputs that appear random or unstructured to humans. To explore this phenomenon, we introduce an analytical framework based on maximum likelihood estimation, without adhering to conventional numerical approaches that rely on gradient-based optimization and explicit labels. Our analysis reveals that DNNs operating in an overparameterized regime exhibit a collapse in the output feature space. While this collapse improves network generalization, adding more layers eventually leads to a state of degeneracy, where the model learns trivial solutions by mapping distinct inputs to the same output, resulting in zero loss. Further investigation demonstrates that this degeneracy can be bypassed using our newly derived "wormhole"…
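The degeneracy described in the abstract, where distinct inputs end up mapped to the same output as layers are added, can be illustrated with a toy sketch. This is not the paper's maximum likelihood analysis. It assumes, purely for illustration, that every layer is contractive (weight matrix rescaled to spectral norm `gain` < 1, followed by `tanh`); the names `contractive_layer`, `output_gap_by_depth`, and `gain` are hypothetical. Under that assumption the distance between the outputs of two distinct inputs shrinks by at least a factor of `gain` per layer.

```python
import numpy as np


def contractive_layer(rng, dim, gain=0.5):
    """Random square weight matrix rescaled so its spectral norm equals `gain`.

    Assumption for this toy example only: gain < 1 makes the layer contractive.
    """
    W = rng.standard_normal((dim, dim))
    W *= gain / np.linalg.norm(W, 2)  # ord=2 on a matrix is the largest singular value
    return W


def output_gap_by_depth(x_a, x_b, depth, seed=0, gain=0.5):
    """Track ||f_k(x_a) - f_k(x_b)|| after each of `depth` stacked layers."""
    rng = np.random.default_rng(seed)
    h_a, h_b = x_a, x_b
    gaps = [float(np.linalg.norm(h_a - h_b))]  # gap at the input (depth 0)
    for _ in range(depth):
        W = contractive_layer(rng, x_a.shape[0], gain)
        # tanh is 1-Lipschitz, so each layer shrinks the gap by at least `gain`.
        h_a, h_b = np.tanh(W @ h_a), np.tanh(W @ h_b)
        gaps.append(float(np.linalg.norm(h_a - h_b)))
    return gaps


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_a, x_b = rng.standard_normal(8), rng.standard_normal(8)
    for k, gap in enumerate(output_gap_by_depth(x_a, x_b, depth=20)):
        print(f"depth {k:2d}: output gap = {gap:.3e}")
```

With enough layers the two outputs become numerically indistinguishable, so any label-free objective that only compares outputs would report near-zero loss for a trivial mapping. Whether trained networks actually satisfy a contraction condition like this one is not something the summary above states.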
