A constrained optimization approach to improve robustness of neural networks
Shudian Zhao, Jan Kronqvist

TL;DR
This paper introduces a nonlinear programming method with adversary-correction constraints and a cutting-plane algorithm to enhance neural network robustness against adversarial attacks while preserving accuracy.
Contribution
It presents a novel optimization-based approach that fine-tunes pre-trained neural networks for improved adversarial robustness with minimal accuracy loss.
Findings
Significant robustness improvements on MNIST and CIFAR10
Effective with small adversarial datasets
Maintains high accuracy on clean data
Abstract
In this paper, we present a novel nonlinear programming-based approach to fine-tune pre-trained neural networks to improve robustness against adversarial attacks while maintaining high accuracy on clean data. Our method introduces adversary-correction constraints to ensure correct classification of adversarial data and minimizes changes to the model parameters. We propose an efficient cutting-plane-based algorithm to iteratively solve the large-scale nonconvex optimization problem by approximating the feasible region through polyhedral cuts and balancing robustness against accuracy. Computational experiments on standard datasets such as MNIST and CIFAR10 demonstrate that the proposed approach significantly improves robustness, even with a very small set of adversarial data, with minimal impact on accuracy.
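To make the idea of adversary-correction constraints and polyhedral cuts more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy linear binary scorer with parameters theta, writes each adversary-correction constraint as g(theta) >= 0 (the adversarial sample is classified correctly with a margin), linearizes the most violated constraint at the current iterate, and projects onto the resulting half-space. The names margin_constraint and correct_with_cuts, the margin value, and the data are all hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
# A constraint is a pair (g, grad_g); g(theta) >= 0 means "satisfied".
Constraint = Tuple[Callable[[Sequence[float]], float],
                   Callable[[Sequence[float]], Vector]]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def margin_constraint(x: Sequence[float], y: int, margin: float) -> Constraint:
    """Toy adversary-correction constraint for a linear scorer (assumption):
    g(theta) = y * (theta . x) - margin, with label y in {-1, +1}."""
    def g(theta: Sequence[float]) -> float:
        return y * dot(theta, x) - margin

    def grad(theta: Sequence[float]) -> Vector:
        return [y * xi for xi in x]

    return g, grad


def correct_with_cuts(theta0: Sequence[float],
                      constraints: List[Constraint],
                      tol: float = 1e-9,
                      max_iter: int = 100) -> Vector:
    """Repeatedly linearize the most violated constraint at the current
    iterate and project onto the half-space defined by that cut."""
    theta = list(theta0)
    if not constraints:
        return theta
    for _ in range(max_iter):
        values = [g(theta) for g, _ in constraints]
        worst = min(range(len(constraints)), key=lambda i: values[i])
        if values[worst] >= -tol:
            break  # every constraint holds up to the tolerance
        grad = constraints[worst][1](theta)
        norm_sq = dot(grad, grad)
        if norm_sq == 0.0:
            break  # zero gradient: the cut carries no information
        # Cut: g(theta_k) + grad . (theta - theta_k) >= 0.
        # Smallest Euclidean move that satisfies it:
        step = -values[worst] / norm_sq
        theta = [t + step * gi for t, gi in zip(theta, grad)]
    return theta


# Hypothetical pre-trained parameters and two adversarial samples.
theta0 = [1.0, 0.0]
adversarial = [([0.2, 1.0], 1), ([1.0, -0.1], 1)]
constraints = [margin_constraint(x, y, margin=0.5) for x, y in adversarial]
theta = correct_with_cuts(theta0, constraints)
change = sum((a - b) ** 2 for a, b in zip(theta, theta0)) ** 0.5
```

In this sketch each step makes the smallest parameter change that satisfies a single cut. A full cutting-plane method would instead accumulate the cuts and re-solve a master problem that minimizes the change from the original parameters over all cuts at once, which is the kind of subproblem the abstract describes.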
Taxonomy
Topics: Neural Networks and Applications
Methods: Sparse Evolutionary Training
