DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
Donghyun Kim, Byeongho Heo, Dongyoon Han

TL;DR
This paper revitalizes DenseNets by refining training methods and architecture, demonstrating they can outperform modern models like Swin Transformer and ConvNeXt on ImageNet-1K and other tasks.
Contribution
It introduces improved training recipes and architectural adjustments that significantly enhance DenseNets' performance, surpassing recent state-of-the-art models.
Findings
DenseNets outperform Swin Transformer, ConvNeXt, and DeiT-III.
Models achieve near state-of-the-art results on ImageNet-1K.
Empirical analysis favors concatenation over additive shortcuts.
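The difference between the two shortcut types named in the last finding can be sketched with a toy example. This is a minimal illustration, not the paper's implementation: features are plain Python lists, and `toy_layer`, `additive_stage`, `concat_stage`, and `growth_rate` are hypothetical names chosen here. It shows only the structural point: an additive shortcut keeps the feature width fixed, while a concatenation shortcut preserves earlier features unchanged and widens the representation at every layer.

```python
# Toy illustration: a feature map is a plain list of floats, one value per "channel".

def toy_layer(x, out_channels):
    """Hypothetical stand-in for a conv block: maps any input to out_channels values."""
    mean = sum(x) / len(x)
    return [mean + 0.1 * i for i in range(out_channels)]

def additive_stage(x, num_layers):
    """ResNet-style shortcut: y = x + f(x), so the width never changes."""
    for _ in range(num_layers):
        fx = toy_layer(x, len(x))           # f(x) must match the input width
        x = [a + b for a, b in zip(x, fx)]  # element-wise sum mixes old and new features
    return x

def concat_stage(x, num_layers, growth_rate):
    """DenseNet-style shortcut: y = [x, f(x)], so the width grows by growth_rate per layer."""
    for _ in range(num_layers):
        x = x + toy_layer(x, growth_rate)   # list concatenation keeps earlier features intact
    return x

x0 = [1.0, 2.0, 3.0, 4.0]
added = additive_stage(x0, num_layers=3)
dense = concat_stage(x0, num_layers=3, growth_rate=2)
print(len(added), len(dense))
```

After three layers the additive path still has 4 channels, whereas the concatenation path has 4 + 3 × 2 = 10, with the first four entries still equal to the input. That growing width is why the abstract pairs concatenation with memory-efficiency work.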
Abstract
This paper revives Densely Connected Convolutional Networks (DenseNets) and reveals the underrated effectiveness over predominant ResNet-style architectures. We believe DenseNets' potential was overlooked due to untouched training methods and traditional design elements not fully revealing their capabilities. Our pilot study shows dense connections through concatenation are strong, demonstrating that DenseNets can be revitalized to compete with modern architectures. We methodically refine suboptimal components - architectural adjustments, block redesign, and improved training recipes towards widening DenseNets and boosting memory efficiency while keeping concatenation shortcuts. Our models, employing simple architectural elements, ultimately surpass Swin Transformer, ConvNeXt, and DeiT-III - key architectures in the residual learning lineage. Furthermore, our models exhibit near…
Code & Models
- 🤗 naver-ai/rdnet_base.nv_in1k (model, 141 downloads, 2 likes)
- 🤗 naver-ai/rdnet_large.nv_in1k (model, 32 downloads, 1 like)
- 🤗 naver-ai/rdnet_large.nv_in1k_ft_in1k_384 (model, 36 downloads, 1 like)
- 🤗 naver-ai/rdnet_tiny.nv_in1k (model, 985 downloads, 5 likes)
- 🤗 naver-ai/rdnet_small.nv_in1k (model, 49 downloads, 2 likes)
- 🤗 birder-project/rdnet_s_arabian-peninsula (model, 25 downloads)
- 🤗 birder-project/rdnet_t_ibot-bioscan5m (model, 97 downloads, 1 like)
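A possible way to load one of the naver-ai checkpoints is sketched below. It rests on assumptions not stated on this page: that these repositories are timm-compatible (their names follow timm's `architecture.tag` convention) and that the installed timm version includes the RDNet architecture. `hub_id` and `load_rdnet` are hypothetical helper names. The birder-project checkpoints likely target a different library and are not covered here.

```python
def hub_id(repo):
    """Build the 'hf_hub:' identifier timm accepts for Hugging Face Hub checkpoints."""
    return "hf_hub:" + repo

def load_rdnet(repo="naver-ai/rdnet_tiny.nv_in1k"):
    """Download and build a checkpoint; needs timm, torch, and network access.

    Assumption: the repository is timm-compatible and the installed timm
    release ships the RDNet architecture.
    """
    import timm  # imported lazily so hub_id works without timm installed
    model = timm.create_model(hub_id(repo), pretrained=True)
    return model.eval()  # inference mode: disables dropout and similar layers
```

Calling `load_rdnet()` would fetch the tiny variant; passing another repository name from the list above selects a different size.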
Taxonomy
Topics: Neural Networks and Applications
Methods: Attention Is All You Need · RDNet · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax · Dropout · Multi-Head Attention · Dense Connections
