Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and Treating CNN Classifiers
Zunlei Feng, Jiacong Hu, Sai Wu, Xiaotian Yu, Jie Song, Mingli Song

TL;DR
This paper introduces Model Doctor, an automatic tool that diagnoses and improves CNN classifiers by using a gradient aggregation strategy, leading to consistent accuracy enhancements across multiple models.
Contribution
It presents the first fully automatic diagnosing and treating method for CNNs, leveraging a novel gradient aggregation approach based on new insights about feature kernel correlations and adversarial sample distribution.
Findings
Improves the accuracy of 16 CNN classifiers by 1-5%.
Applicable to all mainstream CNN architectures.
Demonstrates effectiveness through extensive experiments.
Abstract
Recently, Convolutional Neural Networks (CNNs) have achieved excellent performance in classification tasks. CNNs are widely regarded as 'black boxes', which makes it hard to understand their prediction mechanism and to debug wrong predictions. Some model debugging and explanation methods have been developed to address these drawbacks. However, those methods focus on explaining and diagnosing possible causes of model predictions, leaving researchers to optimize the models manually afterwards. In this paper, we propose the first completely automatic model diagnosing and treating tool, termed Model Doctor. Based on two discoveries that 1) each category is only correlated with sparse and specific convolution kernels, and 2) adversarial samples are isolated while normal samples are successive in the feature space, a simple aggregate gradient constraint is…
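The first discovery, that each category depends on a sparse set of convolution kernels, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of mean absolute gradient as the aggregation rule, the top-k selection, and the numbers are all assumptions made for illustration. It assumes that, for each sample of one category, the gradient of that category's logit with respect to each channel of a convolutional layer has already been reduced to a single number per channel.

```python
from typing import List, Sequence


def aggregate_channel_gradients(per_sample_grads: Sequence[Sequence[float]]) -> List[float]:
    """Average the absolute per-channel gradient over all samples of one category.

    Hypothetical helper: the aggregation rule (mean of absolute values) is an
    assumption, not necessarily the rule used in the paper.
    """
    if not per_sample_grads:
        raise ValueError("need at least one sample")
    num_channels = len(per_sample_grads[0])
    totals = [0.0] * num_channels
    for grads in per_sample_grads:
        if len(grads) != num_channels:
            raise ValueError("all samples must have the same number of channels")
        for c, g in enumerate(grads):
            totals[c] += abs(g)
    return [t / len(per_sample_grads) for t in totals]


def top_kernel_mask(aggregated: Sequence[float], keep: int) -> List[int]:
    """Mark the `keep` channels with the largest aggregated gradient as 1, the rest as 0."""
    ranked = sorted(range(len(aggregated)), key=lambda c: aggregated[c], reverse=True)
    chosen = set(ranked[:keep])
    return [1 if c in chosen else 0 for c in range(len(aggregated))]


# Made-up gradients of one category's logit w.r.t. 4 channels, for 3 samples.
example_grads = [
    [0.9, 0.0, -0.1, 0.05],
    [1.1, 0.1, 0.0, -0.05],
    [1.0, -0.1, 0.1, 0.0],
]
aggregated = aggregate_channel_gradients(example_grads)
mask = top_kernel_mask(aggregated, keep=1)
```

In this toy data only the first channel carries a consistently large gradient, so the mask selects it alone. A per-category mask of this kind is one plausible way to express "sparse and specific" kernel correlation; how the paper turns such a signal into a training constraint is not recoverable from the truncated abstract.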
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
Methods: Convolution
