Technical Report: Quantifying and Analyzing the Generalization Power of a DNN
Yuxuan He, Junpeng Zhang, Lei Cheng, Hongyuan Zhang, Quanshi Zhang

TL;DR
This paper introduces a method to quantify and analyze the generalization power of the interactions encoded by a DNN during training, revealing a three-phase dynamic that separates generalizable from non-generalizable interactions.
Contribution
It provides a new perspective and an efficient approach to dissect the dynamics of DNN generalization, building on recent explainable AI theories.
Findings
Early training removes noisy, non-generalizable interactions and learns simple, generalizable ones.
Later phases learn complex, harder-to-generalize interactions.
Non-generalizable interactions cause the gap between training and testing losses.
Abstract
This paper proposes a new perspective for analyzing the generalization power of deep neural networks (DNNs), i.e., directly disentangling and analyzing the dynamics of generalizable and non-generalizable interactions encoded by a DNN through the training process. Specifically, this work builds upon the recent theoretical achievement in explainable AI, which proves that the detailed inference logic of DNNs can be strictly rewritten as a small number of AND-OR interaction patterns. Based on this, we propose an efficient method to quantify the generalization power of each interaction, and we discover distinct three-phase dynamics of the generalization power of interactions during training. In particular, the early phase of training typically removes noisy and non-generalizable interactions and learns simple and generalizable ones. The second and the third phases tend to capture…
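To make the idea of an interaction pattern concrete, here is a minimal sketch of how AND interactions can be extracted from a model's outputs on masked inputs. It assumes the Harsanyi-style definition commonly used in this line of work, I(S) = Σ_{T⊆S} (-1)^{|S|-|T|} v(T), where v(T) is the model output when only the input variables in T are kept. The abstract does not state the paper's exact formulation, does not define the OR interactions, and does not describe how generalization power is scored, so none of that is reproduced here. `toy_model`, `subsets`, and `and_interactions` are hypothetical names introduced for illustration, and `toy_model` is a hand-written stand-in for a DNN.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a tuple, including the empty one."""
    items = tuple(items)
    for r in range(len(items) + 1):
        yield from combinations(items, r)


def and_interactions(v, n):
    """Compute I(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T).

    `v` maps a frozenset of kept input indices to a scalar output.
    This is the assumed Harsanyi-style definition, not necessarily
    the paper's exact one. Cost is exponential in n, so it only
    suits very small n.
    """
    effects = {}
    for S in subsets(range(n)):
        effects[S] = sum(
            (-1) ** (len(S) - len(T)) * v(frozenset(T)) for T in subsets(S)
        )
    return effects


def toy_model(present):
    """Hypothetical stand-in for a DNN output on a masked input."""
    out = 0.0
    if 0 in present:        # variable 0 contributes on its own
        out += 1.0
    if {1, 2} <= present:   # variables 1 and 2 contribute only jointly
        out += 2.0
    return out


effects = and_interactions(toy_model, 3)

# Keep only the patterns with a non-negligible effect.
salient = {S: e for S, e in effects.items() if abs(e) > 1e-9}

# The effects of all patterns sum back to the output on the full input.
reconstructed = sum(effects.values())
```

On this toy function only two of the eight candidate patterns carry a nonzero effect, namely `(0,)` and `(1, 2)`, and their effects sum to the output on the unmasked input. That sparsity is the property the abstract refers to when it says the inference logic can be rewritten as a small number of interaction patterns.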
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Advanced Graph Neural Networks
