A framework for measuring the training efficiency of a neural architecture

Eduardo Cueto-Mendoza; John D. Kelleher

arXiv:2409.07925·cs.LG·September 13, 2024

Open Access

TL;DR

This paper introduces an experimental framework to measure neural architecture training efficiency, revealing how efficiency varies with training progress, stopping criteria, and model complexity across CNNs and Bayesian models.

Contribution

The paper proposes a novel framework for assessing training efficiency and demonstrates its application on CNNs and Bayesian models with insights into efficiency decay and architecture comparison.

Findings

01. Training efficiency decays as training progresses.

02. CNNs are more efficient than Bayesian CNNs on MNIST and CIFAR-10.

03. Efficiency differences become more pronounced with increased task complexity.

Abstract

Measuring efficiency in neural network system development is an open research problem. This paper presents an experimental framework to measure the training efficiency of a neural architecture. To demonstrate our approach, we analyze the training efficiency of Convolutional Neural Networks (CNNs) and their Bayesian equivalents on the MNIST and CIFAR-10 tasks. Our results show that training efficiency decays as training progresses and varies across different stopping criteria for a given neural model and learning task. We also find a non-linear relationship among training stopping criteria, model size, and training efficiency. Furthermore, we illustrate the potential confounding effects of overtraining on measuring the training efficiency of a neural architecture. Regarding relative training efficiency across different architectures, our results indicate that CNNs are more…
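The abstract does not state the efficiency metric itself, so the following is only an illustrative sketch and not the authors' method. It assumes a hypothetical definition of efficiency, validation accuracy divided by cumulative training cost, and uses made-up numbers to show how such a ratio can decay over epochs and how the measured value depends on the stopping criterion. The names `EpochRecord`, `efficiency_curve`, and `stop_epoch` are invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EpochRecord:
    epoch: int
    accuracy: float  # validation accuracy in [0, 1]
    cost: float      # resource spent during this epoch (e.g. joules or seconds)


def efficiency_curve(records: List[EpochRecord]) -> List[Tuple[int, float]]:
    """Return (epoch, accuracy / cumulative cost) for every epoch.

    Assumption: efficiency is accuracy per unit of cumulative cost.
    The paper may define it differently.
    """
    curve = []
    cumulative = 0.0
    for r in records:
        cumulative += r.cost
        curve.append((r.epoch, r.accuracy / cumulative))
    return curve


def stop_epoch(records: List[EpochRecord], target_accuracy: float) -> Optional[int]:
    """Return the first epoch whose accuracy reaches the target, or None."""
    for r in records:
        if r.accuracy >= target_accuracy:
            return r.epoch
    return None


# Made-up training log: accuracy saturates while per-epoch cost stays constant.
records = [
    EpochRecord(1, 0.80, 10.0),
    EpochRecord(2, 0.90, 10.0),
    EpochRecord(3, 0.95, 10.0),
    EpochRecord(4, 0.96, 10.0),
]

curve = efficiency_curve(records)
for epoch, eff in curve:
    print(f"epoch {epoch}: efficiency {eff:.4f}")

# Different stopping criteria select different epochs, hence different efficiencies.
print(stop_epoch(records, 0.90), stop_epoch(records, 0.95))
```

Under this assumed ratio, accuracy gains flatten while cost keeps accumulating, so the value reported for a model depends on where training is stopped.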

Taxonomy

Topics: Neural Networks and Applications