Game-Theoretic Understanding of Misclassification
Kosuke Sumiyasu, Kazuhiko Kawamoto, Hiroshi Kera

TL;DR
This paper uses game theory to analyze different types of image misclassification, revealing how adversarial and corrupted images differ in pixel interactions and comparing CNNs with Vision Transformers.
Contribution
It introduces a novel game-theoretic framework to analyze misclassification types and applies it to both CNNs and Vision Transformers, uncovering distinct interaction patterns.
Findings
Misclassified adversarial images show stronger high-order interactions than correctly classified clean images.
Misclassified corrupted images show weaker low-order interactions than correctly classified clean images.
Vision Transformers show different interaction distributions compared to CNNs.
Abstract
This paper analyzes various types of image misclassification from a game-theoretic view. Particularly, we consider the misclassification of clean, adversarial, and corrupted images and characterize it through the distribution of multi-order interactions. We discover that the distribution of multi-order interactions varies across the types of misclassification. For example, misclassified adversarial images have a higher strength of high-order interactions than correctly classified clean images, which indicates that adversarial perturbations create spurious features that arise from complex cooperation between pixels. By contrast, misclassified corrupted images have a lower strength of low-order interactions than correctly classified clean images, which indicates that corruptions break the local cooperation between pixels. We also provide the first analysis of Vision Transformers using…
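The abstract does not spell out how a multi-order interaction is computed. The sketch below assumes the definition commonly used in this line of work: the order-m interaction between players i and j is the average of f(S ∪ {i, j}) − f(S ∪ {i}) − f(S ∪ {j}) + f(S) over contexts S of size m drawn from the remaining players. The function name, the Monte Carlo sampling, and the toy score functions are illustrative assumptions, not the authors' implementation. In an image setting, f would presumably be a model score evaluated with absent pixels or patches replaced by a baseline.

```python
import numpy as np

def multi_order_interaction(f, n, i, j, m, num_samples=200, seed=0):
    """Monte Carlo estimate of the order-m interaction between players i and j.

    f: maps a boolean mask of length n (True = player present) to a scalar score.
    m: context size, 0 <= m <= n - 2 (small m = low order, large m = high order).
    """
    if not 0 <= m <= n - 2:
        raise ValueError("m must be in [0, n - 2]")
    rng = np.random.default_rng(seed)
    others = np.array([k for k in range(n) if k not in (i, j)], dtype=int)

    def value(context_mask, with_i, with_j):
        mask = context_mask.copy()
        mask[i] = with_i
        mask[j] = with_j
        return f(mask)

    total = 0.0
    for _ in range(num_samples):
        # Sample a context S of exactly m players, excluding i and j.
        context = rng.choice(others, size=m, replace=False)
        context_mask = np.zeros(n, dtype=bool)
        context_mask[context] = True
        # Marginal effect of i and j together, minus their separate effects.
        total += (value(context_mask, True, True)
                  - value(context_mask, True, False)
                  - value(context_mask, False, True)
                  + value(context_mask, False, False))
    return total / num_samples

# Hypothetical toy scores: one purely pairwise, one purely additive.
def pairwise_score(mask):
    return float(mask[0] and mask[1])

def additive_score(mask):
    return float(mask.sum())

pair_strength = multi_order_interaction(pairwise_score, n=6, i=0, j=1, m=2)
additive_strength = multi_order_interaction(additive_score, n=6, i=0, j=1, m=3)
```

For the pairwise toy score, players 0 and 1 only contribute together, so the estimate is 1 at every order. For the additive score the four terms cancel, so the estimate is 0. A distribution over orders, as discussed in the abstract, would come from repeating this for each m and aggregating the strengths over pixel pairs.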
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
