Generalization ability and Vulnerabilities to adversarial perturbations: Two sides of the same coin
Jung Hoon Lee, Sujith Vijayan

TL;DR
This paper uses self-organizing maps (SOMs) to investigate the internal representations of deep neural networks (DNNs), relating the layer-wise transformations of those representations to the networks' generalization ability and their vulnerability to adversarial attacks.
Contribution
The paper introduces a SOM-based method for analyzing DNNs' internal codes and presents evidence that homogeneous codes may underlie adversarial vulnerabilities.
Findings
Shallow layers produce homogeneous internal codes.
Deep layers transform these into diverse codes.
Homogeneous codes may cause adversarial vulnerabilities.
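
As a rough illustration of what a SOM-based look at internal codes could involve, the sketch below trains a small self-organizing map on a matrix of layer activations, assigns each input to its best-matching unit (BMU), and summarizes how spread out those assignments are with a Shannon entropy. This is not the authors' procedure: the grid size, learning schedule, entropy summary, and the names `train_som`, `bmu_indices`, and `code_entropy` are all assumptions made for illustration, and random numbers stand in for real activations.

```python
import numpy as np


def train_som(data, grid=(5, 5), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim).

    Hypothetical helper: hyperparameters are illustrative, not from the paper.
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of every unit, used by the neighborhood function.
    coords = np.array(
        [(r, c) for r in range(rows) for c in range(cols)], dtype=float
    )
    # Initialize unit weights from randomly chosen data points.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(len(data)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5    # shrinking neighborhood radius
            x = data[idx]
            # Best-matching unit: closest weight vector in Euclidean distance.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighborhood on the grid, centered on the BMU.
            d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def bmu_indices(weights, data):
    """Index of the best-matching unit for each row of `data`."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def code_entropy(bmus, n_units):
    """Shannon entropy (bits) of the BMU histogram.

    Assumed summary statistic: low values mean most inputs share a few units.
    """
    counts = np.bincount(bmus, minlength=n_units)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Placeholder for activations of one layer: 200 inputs, 16 features.
    activations = rng.normal(size=(200, 16))
    som = train_som(activations, grid=(5, 5), epochs=10)
    bmus = bmu_indices(som, activations)
    print("entropy of BMU assignments (bits):", code_entropy(bmus, 25))
```

In practice one would replace `activations` with recorded outputs of a given layer and repeat the analysis layer by layer; whether an entropy of BMU assignments captures what the paper means by "homogeneous" or "diverse" codes is an open assumption here.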
Abstract
Deep neural networks (DNNs), the agents of deep learning (DL), require a massive number of parallel/sequential operations, which makes it difficult to comprehend them and impedes proper diagnosis. Without better knowledge of DNNs' internal process, deploying DNNs in high-stakes domains may lead to catastrophic failures. Therefore, to build more reliable DNNs/DL, it is imperative that we gain insights into their underlying decision-making process. Here, we use the self-organizing map (SOM) to analyze DL models' internal codes associated with DNNs' decision-making. Our analyses suggest that shallow layers close to the input layer map onto homogeneous codes and that deep layers close to the output layer transform these homogeneous codes in shallow layers to diverse codes. We also found evidence indicating that homogeneous codes may underlie DNNs' vulnerabilities to adversarial…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis · Neural Networks and Applications
Methods: Self-Organizing Map
