A practical generalization metric for deep networks benchmarking
Mengqing Huang, Hongchuan Yu, Jianjun Zhang

TL;DR
This paper introduces a practical metric for benchmarking the generalization ability of deep networks, emphasizing the importance of data diversity and accuracy, and highlights discrepancies between theoretical estimations and practical measurements.
Contribution
It proposes a novel practical generalization metric and a benchmarking testbed, revealing gaps between theoretical predictions and actual model performance.
Findings
Generalization depends on accuracy and data diversity.
Most theoretical estimations do not align with practical measurements.
The metric provides an intuitive way to evaluate models and data diversity.
Abstract
There is an ongoing and dedicated effort to estimate bounds on the generalization error of deep learning models, coupled with an increasing interest in practical metrics that can be used to experimentally evaluate a model's ability to generalize. This interest is not only driven by practical considerations but is also vital for theoretical research, as theoretical estimations require practical validation. However, there is currently a lack of research on benchmarking the generalization capacity of various deep networks and verifying these theoretical estimations. This paper aims to introduce a practical generalization metric for benchmarking different deep networks and proposes a novel testbed for the verification of theoretical estimations. Our findings indicate that a deep network's generalization capacity in classification tasks is contingent upon both classification accuracy and…
Taxonomy
Topics: Neural Networks and Applications
