VanillaNet: the Power of Minimalism in Deep Learning
Hanting Chen, Yunhe Wang, Jianyuan Guo, Dacheng Tao

TL;DR
VanillaNet demonstrates that minimalist neural network architectures can achieve performance comparable to complex models such as transformers, offering an efficient alternative for resource-constrained environments.
Contribution
This paper introduces VanillaNet, a neural network architecture that avoids high depth, shortcuts, and intricate operations such as self-attention, highlighting the effectiveness of minimalism in deep learning.
Findings
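VanillaNet achieves performance comparable to more complex architectures, including transformers.
Its compact, shortcut-free design makes it well suited to resource-constrained environments.
Nonlinear activation functions are pruned after training, restoring the original architecture while maintaining performance.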
Abstract
At the heart of foundation models is the philosophy of "more is different", exemplified by the astonishing success in computer vision and natural language processing. However, the challenges of optimization and inherent complexity of transformer models call for a paradigm shift towards simplicity. In this study, we introduce VanillaNet, a neural network architecture that embraces elegance in design. By avoiding high depth, shortcuts, and intricate operations like self-attention, VanillaNet is refreshingly concise yet remarkably powerful. Each layer is carefully crafted to be compact and straightforward, with nonlinear activation functions pruned after training to restore the original architecture. VanillaNet overcomes the challenges of inherent complexity, making it ideal for resource-constrained environments. Its easy-to-understand and highly simplified architecture opens new…
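The abstract's remark that activations are "pruned after training to restore the original architecture" can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes fully connected layers in place of convolutions, and a hypothetical blending schedule `(1 - lam) * ReLU(x) + lam * x` that turns the activation into the identity at `lam = 1`. Once the activation is the identity, two stacked affine layers are algebraically equal to a single one, so the extra layer can be folded away. All function names below are illustrative.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A (m x k) by matrix B (k x n)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def blended_activation(x, lam):
    """Assumed schedule: (1 - lam) * ReLU(x) + lam * x; identity when lam == 1."""
    return [(1.0 - lam) * max(v, 0.0) + lam * v for v in x]

def two_layer(x, W1, b1, W2, b2, lam):
    """Two affine layers with the blended activation in between."""
    h = [a + b for a, b in zip(matvec(W1, x), b1)]
    h = blended_activation(h, lam)
    return [a + b for a, b in zip(matvec(W2, h), b2)]

def merge_layers(W1, b1, W2, b2):
    """Fold two affine layers with no activation between them into one:
    W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)."""
    W = matmul(W2, W1)
    b = [a + c for a, c in zip(matvec(W2, b1), b2)]
    return W, b

rng = random.Random(0)

def rand_vector(n):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def rand_matrix(rows, cols):
    return [rand_vector(cols) for _ in range(rows)]

# Toy sizes: 3 inputs -> 4 hidden units -> 2 outputs.
W1, b1 = rand_matrix(4, 3), rand_vector(4)
W2, b2 = rand_matrix(2, 4), rand_vector(2)
x = rand_vector(3)

# With lam = 1 the activation is the identity, so the layers can be merged.
deep_out = two_layer(x, W1, b1, W2, b2, lam=1.0)
W, b = merge_layers(W1, b1, W2, b2)
merged_out = [a + c for a, c in zip(matvec(W, x), b)]

max_diff = max(abs(p - q) for p, q in zip(deep_out, merged_out))
print("max difference between deep and merged outputs:", max_diff)
```

The merged layer has the shape of the second layer's output by the first layer's input (2 x 3 here), so inference runs one matrix product instead of two. The same identity holds for consecutive convolutions without an intervening nonlinearity, although the kernel arithmetic is more involved than this sketch shows.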
Taxonomy
Topics: Advanced Neural Network Applications · EEG and Brain-Computer Interfaces · COVID-19 diagnosis using AI
