Towards a universal mechanism for successful deep learning

Yuval Meir; Yarden Tzach; Shiri Hodassman; Ofek Tevet; Ido Kanter

arXiv:2309.07537·cs.CV·March 13, 2024

TL;DR

This paper verifies, for VGG-16 and EfficientNet-B0 trained on CIFAR-100 and ImageNet, a mechanism in which each filter identifies small clusters of output labels that sharpen from layer to layer, raising the signal-to-noise ratio (SNR) and the accuracy. The results suggest potential for simplifying architectures.

Contribution

The paper extends a filter-based mechanism for successful deep learning, previously presented for VGG-16 trained on CIFAR-10, to additional architectures and datasets, supporting its universality and its implications for model simplification.

Findings

1. Accuracy and SNR increase progressively with the layers.
2. For a given deep architecture, the maximal error rate increases approximately linearly with the number of labels.
3. The mechanism is consistent across datasets with 3 to 1000 labels.

Abstract

Recently, the underlying mechanism for successful deep learning (DL) was presented based on a quantitative method that measures the quality of a single filter in each layer of a DL model, particularly VGG-16 trained on CIFAR-10. This method shows that each filter identifies small clusters of possible output labels, with additional noise in the form of labels selected outside the clusters. This feature is progressively sharpened with each layer, resulting in an enhanced signal-to-noise ratio (SNR), which leads to an increase in the accuracy of the DL network. In this study, this mechanism is verified for VGG-16 and EfficientNet-B0 trained on the CIFAR-100 and ImageNet datasets, and the main results are as follows. First, the accuracy and SNR progressively increase with the layers. Second, for a given deep architecture, the maximal error rate increases approximately linearly with the number of…
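The cluster-and-noise idea described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's method: the input (a hypothetical vector of one filter's mean response per output label), the threshold rule, the SNR ratio, and the example numbers are all assumptions chosen only to show how a sharper label cluster yields a higher SNR.

```python
import numpy as np

def filter_snr(label_response, threshold_frac=0.5):
    """Toy SNR for a single filter (illustrative assumption, not the paper's definition).

    label_response: hypothetical 1-D array, mean response of the filter per output label.
    threshold_frac: hypothetical cutoff; labels at or above this fraction of the
                    maximum response are treated as the filter's cluster.
    """
    r = np.asarray(label_response, dtype=float)
    in_cluster = r >= threshold_frac * r.max()
    cluster = np.flatnonzero(in_cluster)      # indices of labels inside the cluster
    signal = r[in_cluster].mean()             # mean response inside the cluster
    outside = r[~in_cluster]
    noise = outside.mean() if outside.size else 0.0  # mean response outside the cluster
    snr = signal / noise if noise > 0 else float("inf")
    return cluster, snr

# Made-up responses over 10 labels: same cluster {0, 1}, but the "late" layer
# responds far less to labels outside the cluster.
early = [0.9, 0.8, 0.4, 0.4, 0.3, 0.4, 0.3, 0.4, 0.4, 0.3]
late = [0.9, 0.8, 0.1, 0.1, 0.05, 0.1, 0.05, 0.1, 0.1, 0.05]

cluster_early, snr_early = filter_snr(early)
cluster_late, snr_late = filter_snr(late)
print("early layer:", cluster_early.tolist(), round(snr_early, 2))
print("late layer: ", cluster_late.tolist(), round(snr_late, 2))
```

In this toy setting both filters select the same two labels, and the SNR rises only because the response outside the cluster shrinks, which mirrors the progressive sharpening the abstract describes.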

Taxonomy

Topics: Neural Networks and Applications