FractalNet: Ultra-Deep Neural Networks without Residuals
Gustav Larsson, Michael Maire, Gregory Shakhnarovich

TL;DR
This paper presents FractalNet, a deep neural network architecture built from self-similar fractal structures. It matches residual networks without using any residual connections, which suggests that the ability to transition from effectively shallow to deep during training matters more than residual representations.
Contribution
Introduces a fractal-based macro-architecture for neural networks, demonstrates competitive results without residual connections, and proposes drop-path regularization to prevent co-adaptation of subpaths.
Findings
FractalNet matches residual networks on CIFAR and ImageNet.
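Drop-path, a natural extension of dropout, regularizes co-adaptation of subpaths.
Fractal networks have an anytime property: shallow subnetworks give a quick answer, while deeper subnetworks give a more accurate one at higher latency.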
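The drop-path idea in the findings above can be illustrated with a minimal sketch of a single join under local drop-path. It assumes the join combines its incoming paths by element-wise mean and that at least one path must always survive. The function name `drop_path_join`, the `drop_prob` default, and the plain-list representation of activations are hypothetical choices made for illustration, not the authors' implementation.

```python
import random


def drop_path_join(inputs, drop_prob=0.15, rng=random):
    """Average the inputs to a join that survive local drop-path.

    inputs: list of equal-length lists of floats, one per incoming path.
    drop_prob: illustrative default, not a value taken from this page.
    """
    # Drop each incoming path independently with probability drop_prob.
    kept = [x for x in inputs if rng.random() >= drop_prob]
    # Guarantee that at least one path survives so the signal is never cut.
    if not kept:
        kept = [rng.choice(inputs)]
    # Element-wise mean over the surviving paths.
    return [sum(vals) / len(kept) for vals in zip(*kept)]


shallow = [1.0, 2.0]  # output of a short path
deep = [3.0, 4.0]     # output of a long path
print(drop_path_join([shallow, deep], drop_prob=0.0))
```

With `drop_prob=0.0` both paths are kept and the join returns their mean. With `drop_prob=1.0` every path is dropped at first, so the fallback keeps exactly one path and the join returns that path's output unchanged, which is the situation that forces each subpath to be useful on its own.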
Abstract
We introduce a design strategy for neural network macro-architecture based on self-similarity. Repeated application of a simple expansion rule generates deep networks whose structural layouts are precisely truncated fractals. These networks contain interacting subpaths of different lengths, but do not include any pass-through or residual connections; every internal signal is transformed by a filter and nonlinearity before being seen by subsequent layers. In experiments, fractal networks match the excellent performance of standard residual networks on both CIFAR and ImageNet classification tasks, thereby demonstrating that residual representations may not be fundamental to the success of extremely deep convolutional neural networks. Rather, the key may be the ability to transition, during training, from effectively shallow to deep. We note similarities with student-teacher behavior and…
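As a rough illustration of the expansion rule mentioned in the abstract, the sketch below builds a symbolic description of a fractal block and measures its subpaths. It assumes the rule as stated in the paper: a one-column block is a single convolution, and each expansion joins one convolution with two copies of the previous block in series. The tuple encoding and the helper names (`fractal`, `longest_path`, `shortest_path`, `count_convs`) are hypothetical and exist only for this example.

```python
def fractal(columns):
    """Return a nested description of a fractal block with `columns` columns."""
    if columns == 1:
        return "conv"  # base case: a single convolution layer
    sub = fractal(columns - 1)
    # Expansion rule: join a lone conv with two copies of the previous block in series.
    return ("join", "conv", ("seq", sub, sub))


def longest_path(block):
    """Number of conv layers on the deepest path through the block."""
    if block == "conv":
        return 1
    parts = [longest_path(b) for b in block[1:]]
    return sum(parts) if block[0] == "seq" else max(parts)


def shortest_path(block):
    """Number of conv layers on the shallowest path through the block."""
    if block == "conv":
        return 1
    parts = [shortest_path(b) for b in block[1:]]
    return sum(parts) if block[0] == "seq" else min(parts)


def count_convs(block):
    """Total number of conv layers in the block."""
    if block == "conv":
        return 1
    return sum(count_convs(b) for b in block[1:])


block = fractal(4)
print(longest_path(block), shortest_path(block), count_convs(block))  # prints "8 1 15"
```

Under these assumptions a block with C columns contains a path of a single convolution alongside a path of 2^(C-1) convolutions, so one structure holds both shallow and deep subnetworks without any pass-through connection.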
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection
Methods: Convolution · Batch Normalization · Dense Connections · Max Pooling · Softmax · Random Horizontal Flip · Random Resized Crop · Step Decay · Xavier Initialization
