CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models
Xiang Chen, Fangfang Yang, Chunlei Meng, Yuxian Dong, Ang Li, Yiwei Wei, Jiahuan Long, Jiujiang Guo, Chengyin Hu

TL;DR
This paper introduces CoDA, a framework for simulating realistic clinical image shifts to evaluate and improve the robustness of medical vision-language models, revealing significant vulnerabilities and proposing a repair method.
Contribution
The paper presents CoDA, a novel pipeline shift simulation method for medical images, and a post-hoc token-space repair strategy to enhance model robustness and reliability.
Findings
CoDA significantly degrades MVLM performance under realistic clinical shifts.
Chained pipeline compositions are more damaging than individual stages.
Post-hoc repair improves accuracy on CoDA-shifted images.
Abstract
Medical vision--language models (MVLMs) are increasingly used as perceptual backbones in radiology pipelines and as the visual front end of multimodal assistants, yet their reliability under real clinical workflows remains underexplored. Prior robustness evaluations often assume clean, curated inputs or study isolated corruptions, overlooking routine acquisition, reconstruction, display, and delivery operations that preserve clinical readability while shifting image statistics. To address this gap, we propose CoDA, a chain-of-distribution framework that constructs clinically plausible pipeline shifts by composing acquisition-like shading, reconstruction and display remapping, and delivery and export degradations. Under masked structural-similarity constraints, CoDA jointly optimizes stage compositions and parameters to induce failures while preserving visual plausibility. Across brain…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
