TL;DR
This paper introduces coupled ensembles of neural networks, which reconfigure deep CNNs into parallel branches with tighter coupling, leading to improved performance and parameter efficiency across multiple datasets.
Contribution
Proposes a generic coupled ensemble architecture with tighter branch coupling, enhancing learning and performance without increasing parameters significantly.
Findings
Achieved state-of-the-art error rates on CIFAR-10, CIFAR-100, and SVHN.
Reduced parameters while maintaining or improving accuracy.
Demonstrated the approach's applicability to various CNN architectures.
Abstract
We investigate in this paper the architecture of deep convolutional networks. Building on existing state-of-the-art models, we propose a reconfiguration of the model parameters into several parallel branches at the global network level, with each branch being a standalone CNN. We show that this arrangement is an efficient way to significantly reduce the number of parameters without losing performance or to significantly improve the performance with the same number of parameters. The use of branches brings an additional form of regularization. In addition to the split into parallel branches, we propose a tighter coupling of these branches by placing the "fuse (averaging) layer" before the Log-Likelihood and SoftMax layers during training. This gives another significant performance improvement, the tighter coupling favouring the learning of better representations, even at the level of the…
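To make the "fuse (averaging) layer" placement concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`coupled_loss`, `independent_loss`), the three-branch setup, and the score values are all hypothetical. It contrasts averaging the branch scores before the SoftMax and Log-Likelihood steps with a baseline in which each branch gets its own loss.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of class scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def coupled_loss(branch_scores, target):
    """Fuse (average) the branch scores first, then apply log-softmax and NLL."""
    n = len(branch_scores)
    fused = [sum(col) / n for col in zip(*branch_scores)]
    return -log_softmax(fused)[target]

def independent_loss(branch_scores, target):
    """Baseline: each branch has its own log-softmax + NLL; losses are averaged."""
    losses = [-log_softmax(s)[target] for s in branch_scores]
    return sum(losses) / len(losses)

# Hypothetical scores from 3 branches on a 4-class problem (illustrative values only).
branch_scores = [
    [2.0, 0.5, -1.0, 0.1],
    [1.2, 1.5, -0.3, 0.0],
    [0.3, 0.2, 0.1, 2.5],
]
target = 0

print("coupled loss:    ", coupled_loss(branch_scores, target))
print("independent loss:", independent_loss(branch_scores, target))
```

In the coupled variant a single loss is computed on the fused output, so every branch receives a gradient that depends on what the other branches predict. In the baseline each branch is trained as if it were alone. The sketch shows only the forward computation of the two losses and makes no claim about which placement trains better.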
Taxonomy
Methods: Diffusion-Convolutional Neural Networks · Softmax
