Extending ZACH-ViT to Robust Medical Imaging: Corruption and Adversarial Stress Testing in Low-Data Regimes
Athanasios Angelakis, Marta Gomez-Barrero

TL;DR
This paper evaluates the robustness of ZACH-ViT to common image corruptions and adversarial attacks in low-data medical imaging, reporting competitive performance and robustness advantages over baseline models.
Contribution
The paper presents the first robustness-focused extension of ZACH-ViT, comparing its performance against baseline models under common image corruptions and adversarial perturbations in low-data regimes.
Findings
ZACH-ViT achieves the best mean rank on clean and corrupted data.
ZACH-ViT remains competitive under adversarial attacks, ranking first under FGSM.
Adversarial robustness remains a challenge for all models evaluated.
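For readers unfamiliar with FGSM (the Fast Gradient Sign Method), the sketch below shows the one-step attack on a toy logistic-regression classifier: the input is moved by a fixed step `epsilon` in the direction of the sign of the loss gradient. This is only an illustration under stated assumptions. The classifier, the random weights, the 16-dimensional input, and the value of `epsilon` are all hypothetical and are not taken from the paper, which attacks image models rather than a linear one.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, epsilon):
    """One-step FGSM against a logistic-regression classifier (toy stand-in)."""
    p = sigmoid(x @ w + b)
    # Gradient of the binary cross-entropy loss with respect to the input x.
    grad_x = (p - y) * w
    # Step of size epsilon along the sign of the gradient (L-infinity bounded).
    x_adv = x + epsilon * np.sign(grad_x)
    # Keep the result in the valid pixel range [0, 1].
    return np.clip(x_adv, 0.0, 1.0)

# Hypothetical toy setup: random weights and a random "image" of 16 pixels.
rng = np.random.default_rng(0)
w = rng.normal(size=16)
b = 0.0
x = rng.uniform(0.2, 0.8, size=16)
y = 1.0          # true label
epsilon = 0.05   # assumed perturbation budget

x_adv = fgsm_perturb(x, y, w, b, epsilon)
p_clean = sigmoid(x @ w + b)
p_adv = sigmoid(x_adv @ w + b)
print(f"P(y=1) clean: {p_clean:.4f}, after FGSM: {p_adv:.4f}")
```

The perturbation never changes any pixel by more than `epsilon`, yet the predicted probability of the true class drops, which is the effect a robustness evaluation of this kind measures.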
Abstract
The recently introduced ZACH-ViT (Zero-token Adaptive Compact Hierarchical Vision Transformer) formalized a compact permutation-invariant Vision Transformer for medical imaging and argued that architectural alignment with spatial structure can matter more than universal benchmark dominance. Its design was motivated by the observation that positional embeddings and a dedicated class token encode fixed spatial assumptions that may be suboptimal when spatial organization is weakly informative, locally distributed, or variable across biomedical images. The foundational study established a regime-dependent clean performance profile across MedMNIST, but did not examine robustness in detail. In this work, we present the first robustness-focused extension of ZACH-ViT by evaluating its behavior under common image corruptions and adversarial perturbations in the same low-data setting. We compare…
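The permutation-invariance property mentioned in the abstract can be illustrated with a minimal sketch: self-attention with no positional embeddings, followed by a symmetric pooling step instead of a class token, gives the same output for any ordering of the patch tokens. This is not the ZACH-ViT architecture. The single attention head, the mean pooling, the token count, and the random weights are assumptions chosen only to show the mechanism.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def pooled_attention(tokens, wq, wk, wv):
    """Single-head self-attention without positional embeddings, then mean pooling."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    # Mean pooling over tokens replaces a dedicated class token (assumed choice).
    return (attn @ v).mean(axis=0)

# Hypothetical sizes: 49 patch tokens of dimension 8, random projection weights.
rng = np.random.default_rng(0)
n_tokens, dim = 49, 8
tokens = rng.normal(size=(n_tokens, dim))
wq, wk, wv = (rng.normal(size=(dim, dim)) for _ in range(3))

perm = rng.permutation(n_tokens)
out = pooled_attention(tokens, wq, wk, wv)
out_perm = pooled_attention(tokens[perm], wq, wk, wv)

# The pooled representation agrees up to floating-point error.
print(np.allclose(out, out_perm))
```

Adding a positional embedding to `tokens` before the attention step would break this equality, because each token's representation would then depend on its position in the sequence.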
