Truncating Wide Networks using Binary Tree Architectures
Yan Zhang, Mete Ozay, Shuohao Li, Takayuki Okatani

TL;DR
This paper introduces a binary tree architecture to truncate wide networks, reducing parameters while maintaining or improving accuracy, thus enhancing the efficiency and expressive capacity of deep learning models.
Contribution
The paper proposes a novel binary tree architecture that reduces the width of wide networks from bottom to top, improving parameter efficiency and accuracy trade-offs.
Findings
Achieved 19.22% error on CIFAR-100 with only 28% of the baseline's parameters.
Improved parameter size and accuracy trade-off over baseline networks.
Enhanced expressive capacity by concatenating features from different layers.
Abstract
Recent studies show that a wide deep network can obtain accuracy comparable to that of a deeper but narrower network. Compared to narrower and deeper networks, wide networks employ relatively few layers and have several important benefits: they run faster on parallel computing devices, and they are less affected by vanishing gradient problems. However, the parameter size of a wide network can be very large because of the large width of each layer. To keep the benefits of wide networks while improving their trade-off between parameter size and accuracy, we propose a binary tree architecture that truncates wide networks by reducing their width. More precisely, in the proposed architecture, the width is continuously reduced from lower layers to higher layers in order to increase the expressive capacity of…
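The parameter saving that comes from reducing width layer by layer can be illustrated with simple arithmetic. The sketch below is not the paper's configuration: the halving schedule, the starting width of 256, the depth of 4, and the 3x3 kernels are all assumptions chosen for illustration, and the helper names are hypothetical. It compares the weight count of a constant-width stack of convolutions with a stack whose width halves at every layer, and reports the width obtained if the outputs of all layers were concatenated.

```python
def conv_params(c_in, c_out, k=3):
    """Number of weights in a k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def constant_width_params(width, depth, k=3):
    """Weights in `depth` stacked convolutions that all keep `width` channels."""
    return sum(conv_params(width, width, k) for _ in range(depth))


def halving_width_params(width, depth, k=3):
    """Weights in `depth` stacked convolutions whose output width halves
    at every layer. The halving schedule is an illustrative assumption,
    not the schedule used in the paper."""
    total = 0
    widths = []
    c_in = width
    for _ in range(depth):
        c_out = max(1, c_in // 2)
        total += conv_params(c_in, c_out, k)
        widths.append(c_out)
        c_in = c_out
    return total, widths


# Illustrative settings (assumed, not taken from the paper).
baseline = constant_width_params(256, 4)
truncated, widths = halving_width_params(256, 4)

# Width of the feature obtained by concatenating every layer's output.
concat_width = sum(widths)

print("constant-width weights:", baseline)
print("halving-width weights: ", truncated)
print("per-layer widths:      ", widths)
print("concatenated width:    ", concat_width)
```

Under these assumed settings the narrowing stack uses far fewer weights than the constant-width stack, while concatenating the per-layer outputs still yields a feature nearly as wide as the original layer. The figures printed here are toy values and should not be read as the results reported in the paper.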
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
