Foundation Model-oriented Robustness: Robust Image Model Evaluation with Pretrained Models
Peiyan Zhang, Haoyang Liu, Chaozhuo Li, Xing Xie, Sunghun Kim, Haohan Wang

TL;DR
This paper proposes a new robustness evaluation method for image classification models that compares their performance to a foundation model acting as an oracle, extending evaluation beyond fixed benchmarks and constrained perturbations.
Contribution
It introduces a novel robustness measurement using foundation models as surrogates and a simple method to generate perturbed samples for comprehensive evaluation.
Findings
The new evaluation method provides a more realistic robustness assessment.
Generated data reveals insights into model behaviors.
The evaluation is not limited to fixed benchmarks or constrained perturbations.
Abstract
Machine learning has demonstrated remarkable performance over finite datasets, yet whether the scores over the fixed benchmarks can sufficiently indicate the model's performance in the real world is still in discussion. In reality, an ideal robust model will probably behave similarly to the oracle (e.g., the human users), thus a good evaluation protocol is probably to evaluate the models' behaviors in comparison to the oracle. In this paper, we introduce a new robustness measurement that directly measures the image classification model's performance compared with a surrogate oracle (i.e., a foundation model). Besides, we design a simple method that can accomplish the evaluation beyond the scope of the benchmarks. Our method extends the image datasets with new samples that are sufficiently perturbed to be distinct from the ones in the original sets, but are still bounded within the same…
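As a rough illustration of comparing a model against a surrogate oracle, the sketch below scores a classifier only on perturbed samples whose original label the oracle still assigns. This is a minimal sketch under stated assumptions, not the paper's exact measurement: the function name `oracle_relative_accuracy`, the filtering rule, and the list-of-labels interface are all hypothetical and chosen for illustration.

```python
import math
from typing import Sequence


def oracle_relative_accuracy(
    model_preds: Sequence[int],
    oracle_preds: Sequence[int],
    labels: Sequence[int],
) -> float:
    """Accuracy of the evaluated model on perturbed samples that the
    surrogate oracle still assigns to the original label.

    Assumption (not from the source): a perturbed sample counts as
    label-preserving only when the oracle's prediction equals the
    original label; all other samples are ignored.
    """
    if not (len(model_preds) == len(oracle_preds) == len(labels)):
        raise ValueError("all inputs must have the same length")

    # Keep only the samples the oracle still recognizes as the original class.
    kept = [(m, y) for m, o, y in zip(model_preds, oracle_preds, labels) if o == y]
    if not kept:
        return math.nan  # no label-preserving samples to score

    # Fraction of kept samples on which the model matches the original label.
    return sum(m == y for m, y in kept) / len(kept)


# Toy usage with hypothetical predictions on four perturbed images.
score = oracle_relative_accuracy(
    model_preds=[0, 2, 2, 3],
    oracle_preds=[0, 1, 2, 0],
    labels=[0, 1, 1, 3],
)
```

In this toy call the oracle keeps the original label on the first two samples only, and the model is correct on one of them, so the score is 0.5. In practice the predictions would come from the evaluated classifier and the chosen foundation model, respectively.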
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Adversarial Robustness in Machine Learning
