Hierarchical Vision-Language Learning for Medical Out-of-Distribution Detection
Runhe Lai, Xinhua Lu, Kanghao Chen, Qichao Chen, Wei-Shi Zheng, Ruixuan Wang

TL;DR
This paper introduces a hierarchical vision-language framework for medical out-of-distribution detection, enhancing the identification of unknown diseases by integrating multi-scale visual features and generating hard pseudo-OOD samples.
Contribution
It proposes two strategies for medical OOD detection with vision-language models: a cross-scale visual fusion strategy that couples visual embeddings from multiple scales, and a cross-scale hard pseudo-OOD sample generation strategy.
Findings
Achieves better OOD detection performance than the compared methods on three public medical datasets
Enriches medical image representations with multi-scale features
Effective in detecting challenging unknown diseases
Abstract
In trustworthy medical diagnosis systems, integrating out-of-distribution (OOD) detection aims to identify unknown diseases in samples, thereby mitigating the risk of misdiagnosis. In this study, we propose a novel OOD detection framework based on vision-language models (VLMs), which integrates hierarchical visual information to cope with challenging unknown diseases that resemble known diseases. Specifically, a cross-scale visual fusion strategy is proposed to couple visual embeddings from multiple scales. This enriches the detailed representation of medical images and thus improves the discrimination of unknown diseases. Moreover, a cross-scale hard pseudo-OOD sample generation strategy is proposed to benefit OOD detection maximally. Experimental evaluations on three public medical datasets support that the proposed framework achieves superior OOD detection performance compared to…
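To make the idea of scoring with fused multi-scale embeddings concrete, the following is a minimal sketch and not the authors' implementation. It assumes that per-scale image embeddings and per-class text embeddings have already been produced by some VLM, that fusion is a plain weighted average (a stand-in for the paper's cross-scale fusion, whose details are not given here), and that the OOD score is one minus the maximum softmax probability over image-text cosine similarities. The function names `fuse_scales` and `ood_score` and the temperature value 0.07 are hypothetical choices for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse_scales(scale_embeddings, weights=None):
    """Fuse per-scale embeddings by a weighted mean of unit vectors.

    Assumption: simple averaging stands in for the paper's fusion strategy.
    """
    if weights is None:
        weights = [1.0 / len(scale_embeddings)] * len(scale_embeddings)
    units = [l2_normalize(e) for e in scale_embeddings]
    dim = len(units[0])
    fused = [sum(w * u[i] for w, u in zip(weights, units)) for i in range(dim)]
    return l2_normalize(fused)


def ood_score(image_emb, class_text_embs, temperature=0.07):
    """Return 1 - max softmax probability over image-text cosine similarities.

    Higher values mean the image matches no known class prompt well.
    """
    img = l2_normalize(image_emb)
    logits = []
    for text_emb in class_text_embs:
        txt = l2_normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(cosine / temperature)
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    return 1.0 - max(exps) / sum(exps)
```

In use, an image whose fused embedding aligns with one known-class text embedding gets a score near 0, while an image that is equally dissimilar to all known classes gets a score near 1 - 1/K for K classes. A threshold on this score, chosen on held-out data, would then separate known from unknown cases.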
