CNS-Bench: Benchmarking Image Classifier Robustness Under Continuous Nuisance Shifts
Olaf Dünkel, Artur Jesslen, Jiahao Xie, Christian Theobalt, Christian Rupprecht, Adam Kortylewski

TL;DR
CNS-Bench is a benchmark for evaluating image classifier robustness under realistic, continuous nuisance shifts generated by diffusion models. It shows that model rankings change across nuisance shifts and that evaluating over continuous severities exposes the points at which models fail.
Contribution
The paper presents CNS-Bench, a novel benchmark that enables continuous nuisance shift evaluation using diffusion models with LoRA adapters, improving the realism and reliability of OOD robustness testing.
Findings
Model rankings vary across different nuisance shifts.
Continuous evaluation uncovers model failure points.
The filtering mechanism improves benchmarking reliability.
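The second finding can be illustrated with a minimal sketch of what evaluating over continuous severities might look like in practice. This is not the benchmark's code: the function names, the severity values, and the toy correctness flags are all hypothetical. The sketch assumes that each image has already been classified at several severity levels and that only a correct/incorrect flag per image and level is available.

```python
from typing import Dict, List, Optional


def accuracy_per_severity(correct: Dict[float, List[bool]]) -> Dict[float, float]:
    """Fraction of correctly classified images at each severity level."""
    return {s: sum(flags) / len(flags) for s, flags in sorted(correct.items())}


def failure_point(per_image: Dict[float, bool]) -> Optional[float]:
    """Smallest severity at which one image is misclassified, or None if it never fails."""
    for s in sorted(per_image):
        if not per_image[s]:
            return s
    return None


# Hypothetical toy data: severity level -> correctness flags for four images.
correct = {
    0.0: [True, True, True, True],
    0.5: [True, True, True, False],
    1.0: [True, False, True, False],
    1.5: [False, False, True, False],
}

acc = accuracy_per_severity(correct)

# Correctness of the second image (index 1) across severities.
image_1 = {s: flags[1] for s, flags in correct.items()}
fp = failure_point(image_1)
```

A single accuracy number at one fixed severity would hide the per-image failure points that the sweep over severities makes visible.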
Abstract
An important challenge when using computer vision models in the real world is to evaluate their performance in potential out-of-distribution (OOD) scenarios. While simple synthetic corruptions are commonly applied to test OOD robustness, they often fail to capture nuisance shifts that occur in the real world. Recently, diffusion models have been applied to generate realistic images for benchmarking, but they are restricted to binary nuisance shifts. In this work, we introduce CNS-Bench, a Continuous Nuisance Shift Benchmark to quantify OOD robustness of image classifiers for continuous and realistic generative nuisance shifts. CNS-Bench allows generating a wide range of individual nuisance shifts in continuous severities by applying LoRA adapters to diffusion models. To address failure cases, we propose a filtering mechanism that outperforms previous methods, thereby enabling reliable…
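The abstract does not spell out how a LoRA adapter yields a continuous severity. One common way to get such a knob, stated here as an assumption rather than as the paper's confirmed method, is to scale the adapter's low-rank update by a scalar. The symbols below (W_0 for a frozen weight matrix, B and A for the low-rank factors, s for the severity scale) are illustrative and not taken from the source.

```latex
% Assumed form: frozen weight W_0 plus a low-rank update scaled by s
W(s) = W_0 + s \, B A,
\qquad B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)
```

Under this assumption, s = 0 recovers the unmodified diffusion model, and increasing s strengthens the nuisance shift encoded by the adapter.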
Taxonomy
Topics: Neural Networks and Applications
