Analysis and Optimization of Convolutional Neural Network Architectures
Martin Thoma

TL;DR
This paper analyzes CNN architectures, introduces a confusion-matrix-based visualization technique, evaluates hierarchical classifiers, and develops a compact model that outperforms the state of the art on several benchmarks.
Contribution
It provides a comprehensive overview of CNN analysis methods, introduces a new visualization approach, and presents a compact, high-performing CNN model.
Findings
Smaller batch sizes improve accuracy on CIFAR-100.
Averaging ensembles and data augmentation also improve performance on CIFAR-100.
Learned color transformations did not significantly impact accuracy.
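To make the ensemble finding concrete, here is a minimal sketch of an averaging ensemble: the class probabilities of several models are averaged before taking the argmax. It is an illustration only, not the paper's implementation; the function name `average_ensemble` and the probability values are hypothetical.

```python
import numpy as np


def average_ensemble(prob_list):
    """Average per-model class probabilities.

    Each array in prob_list has shape (n_samples, n_classes).
    """
    stacked = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    return stacked.mean(axis=0)


# Hypothetical softmax outputs of three models: two samples, three classes.
model_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
model_b = np.array([[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]])
model_c = np.array([[0.5, 0.4, 0.1], [0.2, 0.3, 0.5]])

avg = average_ensemble([model_a, model_b, model_c])
predictions = avg.argmax(axis=1)  # ensemble decision per sample
print(predictions)
```

In this toy data, `model_a` alone would assign the second sample to class 1, while the averaged probabilities favor class 2, which shows how averaging can overrule a single model.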
Abstract
Convolutional Neural Networks (CNNs) have dominated various computer vision tasks since Alex Krizhevsky showed that they can be trained effectively and reduced the top-5 error from 26.2% to 15.3% on the ImageNet Large Scale Visual Recognition Challenge. Many aspects of CNNs are examined in various publications, but literature about the analysis and construction of neural network architectures is rare. This work is one step towards closing this gap. A comprehensive overview of existing techniques for CNN analysis and topology construction is provided. A novel way to visualize classification errors with confusion matrices was developed. Based on this method, hierarchical classifiers are described and evaluated. Additionally, some results are confirmed and quantified for CIFAR-100. For example, the positive impact of smaller batch sizes, averaging ensembles, data augmentation and test-time…
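The abstract builds its error visualization and hierarchical classifiers on confusion matrices. The sketch below only computes a plain confusion matrix and lists the most frequently confused class pairs; it does not reproduce the paper's visualization method. The helper names and the toy labels are hypothetical.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] counts samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def most_confused_pairs(cm, top_k=3):
    """Return up to top_k (true, predicted, count) off-diagonal entries."""
    off = cm.copy()
    np.fill_diagonal(off, 0)  # ignore correct classifications
    flat = np.argsort(off, axis=None)[::-1][:top_k]
    rows, cols = np.unravel_index(flat, off.shape)
    return [(int(r), int(c), int(off[r, c]))
            for r, c in zip(rows, cols) if off[r, c] > 0]


# Toy labels for illustration only.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
pairs = most_confused_pairs(cm)
print(cm)
print(pairs)
```

Classes that are often confused with each other are natural candidates for grouping, which is one plausible reading of how a confusion matrix can inform a hierarchy of classifiers.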
Taxonomy
Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
