The Artificial Mind's Eye: Resisting Adversarials for Convolutional Neural Networks using Internal Projection
Harm Berntsen, Wouter Kuijper, Tom Heskes

TL;DR
This paper proposes a new neural network architecture that enhances robustness against adversarial attacks by generating and comparing internal class representations, providing a form of proof for object presence.
Contribution
The paper introduces an architecture that uses internal projection and comparison to resist adversarial inputs in convolutional neural networks.
Findings
Improved robustness to adversarial attacks demonstrated.
Network provides internal 'proof' of object presence.
Method enhances interpretability and reliability.
Abstract
We introduce a novel artificial neural network architecture that integrates robustness to adversarial input into the network structure. The main idea of our approach is to force the network to make predictions on what the given instance of the class under consideration would look like and subsequently test those predictions. By forcing the network to redraw the relevant parts of the image and subsequently comparing this new image to the original, we have the network give a "proof" of the presence of the object.
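The redraw-and-compare idea in the abstract can be illustrated with a small sketch. This is not the authors' architecture: the learned projection is replaced here by a fixed per-class template scaled by least squares, and the names `project`, `classify_with_proof`, `templates`, and `threshold` are hypothetical, chosen only for illustration. The sketch shows the control flow the abstract describes: redraw the input under each class hypothesis, compare the redrawn image to the original, and accept a class only if the comparison succeeds.

```python
import numpy as np


def project(image, template):
    """Redraw `image` under one class hypothesis.

    Assumption: a fixed, non-zero template scaled by least squares stands in
    for the learned projection described in the abstract.
    """
    # np.vdot flattens both arrays, so this is a plain inner product.
    scale = float(np.vdot(template, image) / np.vdot(template, template))
    return scale * template


def classify_with_proof(image, templates, threshold):
    """Return (label or None, per-class comparison errors).

    A class is accepted only when its redrawn image matches the input
    closely enough; otherwise the input is rejected (label is None).
    """
    errors = {}
    for label, template in templates.items():
        redrawn = project(image, template)
        # Mean squared difference between the redrawn image and the original.
        errors[label] = float(np.mean((redrawn - image) ** 2))
    best = min(errors, key=errors.get)
    accepted = errors[best] <= threshold
    return (best if accepted else None), errors
```

With one template holding a horizontal bar and another a vertical bar on a 4x4 grid, a dimmed horizontal bar is redrawn exactly and accepted, while a checkerboard cannot be redrawn well by either template and is rejected. In this toy setting the rejection branch plays the role of the failed "proof": an input that no class can reproduce is not assigned a label.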
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
