Mitigating the Impact of Adversarial Attacks in Very Deep Networks
Mohammed Hassanin, Ibrahim Radwan, Nour Moustafa, Murat Tahtali, Neeraj Kumar

TL;DR
This paper introduces a novel defense mechanism combining a Defensive Feature Layer and Polarized Contrastive Loss to improve the robustness of deep neural networks against data poisoning attacks, demonstrating superior performance on CIFAR-10 and MNIST datasets.
Contribution
It proposes an attack-agnostic defense method integrating a Defensive Feature Layer and Polarized Contrastive Loss to mitigate data poisoning effects in deep networks.
Findings
Enhanced robustness against data poisoning attacks
Improved classification accuracy on attacked datasets
Outperforms recent peer techniques
Abstract
Deep Neural Network (DNN) models have vulnerabilities related to security concerns, with attackers usually employing complex hacking techniques to expose their structures. Data poisoning-enabled perturbation attacks are complex adversarial ones that inject false data into models. They negatively impact the learning process, with no benefit to deeper networks, as they degrade a model's accuracy and convergence rates. In this paper, we propose an attack-agnostic-based defense method for mitigating their influence. In it, a Defensive Feature Layer (DFL) is integrated with a well-known DNN architecture which assists in neutralizing the effects of illegitimate perturbation samples in the feature space. To boost the robustness and trustworthiness of this method for correctly classifying attacked input samples, we regularize the hidden space of a trained model with a discriminative loss…
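The abstract says the hidden space is regularized with a discriminative loss, but the excerpt is cut off before that loss is defined. As a rough illustration of the general idea only, the sketch below implements the classic margin-based pairwise contrastive loss, which is not the paper's Polarized Contrastive Loss. The function name `pairwise_contrastive_loss`, the `margin` parameter, and the toy feature vectors are all assumptions made for this example.

```python
import math


def pairwise_contrastive_loss(features, labels, margin=1.0):
    """Classic margin-based contrastive loss averaged over all sample pairs.

    Same-label pairs are penalised by their squared distance; different-label
    pairs are penalised only when they are closer than `margin`.
    NOTE: a generic stand-in, not the loss defined in the paper.
    """
    total, pairs = 0.0, 0
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            dist = math.dist(features[i], features[j])
            if labels[i] == labels[j]:
                total += dist ** 2
            else:
                total += max(0.0, margin - dist) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical hidden-space features: two tight, well-separated classes.
clean = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.1, 3.0]]
# The same points after a perturbation drags one class-0 sample towards class 1.
perturbed = [[0.0, 0.0], [2.5, 2.5], [3.0, 3.0], [3.1, 3.0]]
labels = [0, 0, 1, 1]

clean_loss = pairwise_contrastive_loss(clean, labels)
perturbed_loss = pairwise_contrastive_loss(perturbed, labels)
print(f"clean: {clean_loss:.4f}  perturbed: {perturbed_loss:.4f}")
```

In this toy setup the perturbed features incur a much larger loss than the clean ones, which is the kind of signal a discriminative regularizer can use to keep classes compact and separated in feature space.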
