A Guide to Robust Generalization: The Impact of Architecture, Pre-training, and Optimization Strategy
Maxime Heuillet, Rishika Bhagwatkar, Jonas Ngnawé, Yann Pequignot, Alexandre Larouche, Christian Gagné, Irina Rish, Ola Ahmad, Audrey Durand

TL;DR
This paper provides a comprehensive empirical analysis of how architecture, pretraining, and optimization choices affect the robustness of deep learning models to input perturbations, offering practical insights for improving generalization.
Contribution
It presents the most diverse benchmark to date on robust fine-tuning, analyzing 1,440 configurations across multiple datasets, architectures, and perturbations, revealing key factors influencing robustness.
Findings
Supervised pretrained CNNs often outperform attention-based models in robustness.
Design choices like architecture and loss functions significantly impact generalization to unseen perturbations.
The study offers practical guidance for selecting model and training strategies to enhance robustness.
Abstract
Deep learning models operating in the image domain are vulnerable to small input perturbations. For years, robustness to such perturbations was pursued by training models from scratch (i.e., with random initializations) using specialized loss objectives. Recently, robust fine-tuning has emerged as a more efficient alternative: instead of training from scratch, pretrained models are adapted to maximize predictive performance and robustness. To conduct robust fine-tuning, practitioners design an optimization strategy that includes the model update protocol (e.g., full or partial) and the specialized loss objective. Additional design choices include the architecture type and size, and the pretrained representation. These design choices affect robust generalization, which is the model's ability to maintain performance when exposed to new and unseen perturbations at test time. Understanding…
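The two ingredients of an optimization strategy named in the abstract, the model update protocol (full or partial) and the specialized loss objective, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the "pretrained backbone" is a per-feature scaling vector, the "head" is a linear classifier, and the loss is plain cross-entropy on a single-step FGSM perturbation, swapped in as a stand-in for whichever objectives the study actually evaluates. The names `predict`, `fgsm`, and `robust_finetune_step` are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sign(g):
    # Returns -1, 0, or 1.
    return (g > 0) - (g < 0)


def predict(backbone, head, x):
    # Toy model: logit = sum_i head_i * backbone_i * x_i.
    # `backbone` stands in for a pretrained representation (assumption).
    feats = [b * xi for b, xi in zip(backbone, x)]
    return sigmoid(sum(h * f for h, f in zip(head, feats)))


def fgsm(backbone, head, x, y, eps):
    # Single-step L-infinity perturbation of size eps in the direction
    # that increases the binary cross-entropy loss.
    p = predict(backbone, head, x)
    grad_x = [(p - y) * h * b for h, b in zip(head, backbone)]
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]


def robust_finetune_step(backbone, head, x, y, eps, lr, protocol):
    # Loss objective (assumption): cross-entropy on the perturbed input.
    x_adv = fgsm(backbone, head, x, y, eps)
    err = predict(backbone, head, x_adv) - y

    # Gradients of the loss with respect to each parameter group.
    g_head = [err * b * xi for b, xi in zip(backbone, x_adv)]
    g_back = [err * h * xi for h, xi in zip(head, x_adv)]

    new_head = [h - lr * g for h, g in zip(head, g_head)]
    if protocol == "full":
        # Full update: the pretrained weights are adapted as well.
        new_backbone = [b - lr * g for b, g in zip(backbone, g_back)]
    elif protocol == "partial":
        # Partial update: the pretrained weights stay frozen.
        new_backbone = list(backbone)
    else:
        raise ValueError("protocol must be 'full' or 'partial'")
    return new_backbone, new_head
```

In this sketch, calling `robust_finetune_step(..., protocol="partial")` changes only the head, while `protocol="full"` also moves the backbone weights. Robust generalization, as the abstract defines it, would then be measured by evaluating the adapted model on perturbation types that were not used inside `fgsm` during training.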
Taxonomy
Topics: Manufacturing Process and Optimization
