TL;DR
This paper introduces SaGD, a sharpness-aware geometric defense framework that enhances out-of-distribution detection by smoothing the adversarial loss landscape, leading to improved robustness against attacks.
Contribution
The paper proposes a novel SaGD framework that improves OOD detection by addressing the rugged loss landscape caused by adversarial training, with extensions for unseen attacks.
Findings
Significant improvement in FPR and AUC over state-of-the-art methods.
Effective differentiation of CIFAR-100 from other OOD datasets under attack.
Insights into the relationship between loss landscape sharpness and OOD detection robustness.
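The two metrics named in the findings can be made concrete with a small sketch. It assumes that "FPR" means the false-positive rate at 95% true-positive rate (the reviews below write it as $FPR_{95}$), that a higher score means "more in-distribution", and that ID is the positive class. The function names `fpr_at_tpr` and `auroc` are illustrative and not taken from the paper.

```python
import math
from typing import Sequence


def fpr_at_tpr(id_scores: Sequence[float],
               ood_scores: Sequence[float],
               tpr: float = 0.95) -> float:
    """Fraction of OOD samples accepted as ID at the threshold that keeps
    at least `tpr` of the ID samples.

    Assumed convention: higher score = more in-distribution.
    """
    ranked = sorted(id_scores, reverse=True)
    # Smallest number of ID samples that must be accepted to reach the
    # target TPR; the tiny offset guards against float round-off in tpr * n.
    k = max(1, math.ceil(tpr * len(ranked) - 1e-12))
    threshold = ranked[k - 1]
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)


def auroc(id_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """Probability that a random ID sample scores above a random OOD sample.

    Ties count as one half. This pairwise form is O(n * m) and is meant for
    illustration, not for large evaluation sets.
    """
    wins = 0.0
    for a in id_scores:
        for b in ood_scores:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))
```

Lower `fpr_at_tpr` and higher `auroc` both indicate better separation of ID from OOD scores. Under the setting described in the abstract, adversarially perturbed ID samples would be scored alongside clean ID samples in `id_scores`.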
Abstract
Out-of-distribution (OOD) detection ensures safe and reliable model deployment. Contemporary OOD algorithms using geometry projection can detect OOD or adversarial samples from clean in-distribution (ID) samples. However, this setting regards adversarial ID samples as OOD, leading to incorrect OOD predictions. Existing efforts on OOD detection with ID and OOD data under attacks are minimal. In this paper, we develop a robust OOD detection method that distinguishes adversarial ID samples from OOD ones. The sharp loss landscape created by adversarial training hinders model convergence, impacting the latent embedding quality for OOD score calculation. Therefore, we introduce a Sharpness-aware Geometric Defense (SaGD) framework to smooth out the rugged adversarial loss landscape in the projected latent geometry. Enhanced geometric embedding convergence enables accurate ID data…
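The idea of smoothing a sharp loss landscape can be illustrated with a generic sharpness-aware minimization (SAM) update. The sketch below is plain Euclidean SAM on a toy quadratic loss, not the paper's method: the reviews mention a Riemannian variant (RSAM) applied in a projected latent geometry, which this example does not attempt to reproduce. The names `sam_step`, `rho`, `lr`, and the toy loss are all assumptions made for illustration.

```python
import math
from typing import Callable, List

Vector = List[float]

# Toy curvature per coordinate (hypothetical): the second axis is "sharper".
CURVATURE = (1.0, 4.0)


def quadratic_loss(w: Vector) -> float:
    """Toy loss L(w) = sum_i c_i * w_i^2."""
    return sum(c * x * x for c, x in zip(CURVATURE, w))


def quadratic_grad(w: Vector) -> Vector:
    """Analytic gradient of the toy loss: dL/dw_i = 2 * c_i * w_i."""
    return [2.0 * c * x for c, x in zip(CURVATURE, w)]


def sam_step(w: Vector,
             grad_fn: Callable[[Vector], Vector],
             lr: float = 0.1,
             rho: float = 0.05) -> Vector:
    """One Euclidean SAM-style update.

    1. Move to the approximate worst-case point inside an L2 ball of
       radius `rho` around `w` (a step along the normalized gradient).
    2. Apply the gradient evaluated at that perturbed point to the
       original weights.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        # Zero gradient: the perturbation direction is undefined, so skip.
        return list(w)
    w_perturbed = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_perturbed = grad_fn(w_perturbed)
    return [wi - lr * gi for wi, gi in zip(w, g_perturbed)]


if __name__ == "__main__":
    w = [1.0, 1.0]
    for _ in range(100):
        w = sam_step(w, quadratic_grad)
    print(quadratic_loss(w))
```

Because the update uses the gradient at the perturbed point, it penalizes directions in which the loss rises quickly within the `rho`-ball, which is the sense in which SAM-style training favors flatter regions. In adversarial training, the loss passed to `grad_fn` would be the adversarial loss rather than this toy quadratic.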
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
- The paper presents a novel angle by examining OOD detection under potential adversarial attacks, a scenario that has received limited attention.
- The experimental evaluation is comprehensive and thorough.
## Should adversarial examples be classified as in-distribution samples rather than outliers?
Adding an adversarial perturbation clearly shifts the distribution, so why should the sample still count as in-distribution?
## About the scenario
What are some real-world applications where out-of-distribution detection must handle potentially adversarially attacked images?
## About the Contribution
The contribution of this work should be carefully justified. Most of the subsections in Section 3 are existi
1. The authors investigate various adversarial attacks on different OOD detection approaches. Extensive experiments demonstrate the effectiveness of the proposed method.
2. They introduce Jitter-based perturbation in adversarial training to extend the defense ability against unseen attacks.
3. They employ Multi-Geometry Projection (MGP) and Riemannian Sharpness-aware Minimization (RSAM) for the OOD detection.
1. My first concern is the reasonableness of the research setting. The paper presents a method to classify adversarial examples as in-distribution (ID) samples in the context of out-of-distribution (OOD) detection. However, I find the rationale for this setting questionable for two main reasons:
   * Adversarial examples, by design, deviate significantly from the natural data distribution, even if they remain close in image space. Treating them as OOD samples aligns with standard OOD detection objec
1. It introduces a novel sharpness-aware method for improving OOD detection in adversarial training. The proposed method investigates the combination of Riemannian geometries under adversarial conditions. This expansion of geometry space sharpens the proposed defense against adversarial attacks and avoids reliance on large OOD datasets for auxiliary training.
2. The proposed SaGD sets a new SoTA for OOD detection, excelling in $FPR_{95}$ and AUC metrics, both with and without attacks.
3. It per
1. It should provide a detailed analysis of the computational complexity involved in computing the OOD score. Additionally, it is important to examine how the number of in-distribution (ID) training samples affects the performance of the OOD score, as this can influence the scalability and generalizability of the approach.
2. Choosing an appropriate threshold $\lambda$ for the OOD score can be challenging in real-world applications. The paper should include a clear, practical procedure for dete
