Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification

Yaoqin Ye; Junjie Zhang; Hongwei Shi

arXiv:2405.06468 · cs.CV · September 16, 2024

TL;DR

This paper introduces PsPG, a novel method that automatically generates class-specific prompts for pre-trained vision-language models to improve multi-label medical image classification, especially for unseen categories.

Contribution

The paper proposes PsPG, a text generation-inspired prompt generation approach that enhances zero-shot multi-label classification in medical imaging by leveraging multi-modal feature priors.

Findings

01. PsPG outperforms existing prompt learning methods on chest radiograph datasets.

02. The approach effectively generalizes to unseen categories in multi-label classification.

03. Automated pseudo-prompt generation reduces reliance on manual prompt design.

Abstract

Medical image recognition is complicated by the presence of varied and multiple pathological indications, which makes it a multi-label classification problem with unseen labels. This complexity underlines the need for computer-aided diagnosis methods that employ multi-label zero-shot learning. Recent pre-trained vision-language models (VLMs) have shown notable zero-shot classification abilities on medical images. However, these methods are limited in their ability to leverage extensive pre-trained knowledge from broader image datasets, and they often depend on manual prompt construction by expert radiologists. By automating the process of prompt tuning, prompt learning techniques have emerged as an efficient way to adapt VLMs to downstream tasks. Yet, existing CoOp-based strategies fall short in performing class-specific prompts on unseen categories,…
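As a rough illustration of the multi-label zero-shot setting the abstract describes, the sketch below scores an image embedding against one prompt embedding per label and keeps every label above a threshold. It is not PsPG or the authors' code: the function names (`cosine`, `multilabel_scores`, `predict_labels`), the threshold of 0.5, and the toy 2-D vectors are all assumptions made for illustration, and real embeddings would come from a VLM's image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multilabel_scores(image_emb, prompt_embs):
    """Score each label independently against the image embedding.

    No softmax across labels: in multi-label classification several
    findings can be present at once, so the scores need not sum to 1.
    """
    return {label: cosine(image_emb, emb) for label, emb in prompt_embs.items()}


def predict_labels(scores, threshold=0.5):
    """Keep every label whose score clears the (hypothetical) threshold."""
    return sorted(label for label, s in scores.items() if s >= threshold)


# Toy 2-D embeddings standing in for encoder outputs (illustrative values only).
image_emb = [1.0, 0.0]
prompt_embs = {
    "label_a": [1.0, 0.0],
    "label_b": [0.0, 1.0],
    "label_c": [1.0, 1.0],
}
scores = multilabel_scores(image_emb, prompt_embs)
print(predict_labels(scores))
```

Because each label is scored on its own, a label unseen during training can be added simply by supplying a prompt embedding for it; the quality of that embedding is what prompt construction, manual or learned, is meant to improve.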

Code & Models

Repositories

fallingnight/pspg (PyTorch, official)

Taxonomy

Topics: Image Retrieval and Classification Techniques