Toward Robust Medical Fairness: Debiased Dual-Modal Alignment via Text-Guided Attribute-Disentangled Prompt Learning for Vision-Language Models
Yuexuan Xia, Benteng Ma, Jiang He, Zhiyong Wang, Qi Dou, Yong Xia

TL;DR
This paper introduces DualFairVL, a multimodal prompt-learning framework that jointly debiases and aligns vision-language representations to improve the fairness and robustness of medical diagnosis across diverse datasets.
Contribution
The paper proposes a dual-branch architecture that disentangles sensitive attributes from target attributes, together with a hypernetwork that generates instance-aware prompts, with the aim of improving fairness in medical vision-language models.
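To make the idea of a hypernetwork producing instance-aware prompts concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the single linear layer, the function names `make_hypernetwork` and `generate_prompts`, and all dimensions are assumptions chosen for illustration. The only point it shows is that the prompt tokens are computed from each input's feature vector rather than being one fixed learned set shared by all inputs.

```python
import random

def make_hypernetwork(feature_dim, n_tokens, token_dim, seed=0):
    """Build a toy hypernetwork: one linear layer (hypothetical design).

    It maps an instance feature of size feature_dim to
    n_tokens * token_dim prompt values.
    """
    rng = random.Random(seed)
    out_dim = n_tokens * token_dim
    weights = [[rng.gauss(0.0, 0.02) for _ in range(feature_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias

def generate_prompts(hypernet, feature, n_tokens, token_dim):
    """Return n_tokens prompt vectors conditioned on one instance feature."""
    weights, bias = hypernet
    flat = [sum(w * x for w, x in zip(row, feature)) + b
            for row, b in zip(weights, bias)]
    # Reshape the flat output into a list of prompt tokens.
    return [flat[i * token_dim:(i + 1) * token_dim] for i in range(n_tokens)]

# Toy usage: two different instance features yield two different prompt sets.
hypernet = make_hypernetwork(feature_dim=4, n_tokens=2, token_dim=3)
prompts_a = generate_prompts(hypernet, [1.0, 0.0, 0.0, 0.0], 2, 3)
prompts_b = generate_prompts(hypernet, [0.0, 1.0, 0.0, 0.0], 2, 3)
```

In a real vision-language model the generated tokens would be fed to the frozen encoders alongside the input, and only the hypernetwork weights would be trained; that wiring is omitted here.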
Findings
Achieves state-of-the-art fairness and accuracy on eight datasets.
Outperforms full fine-tuning and parameter-efficient baselines.
Operates effectively with only 3.6 million trainable parameters.
Abstract
Ensuring fairness across demographic groups in medical diagnosis is essential for equitable healthcare, particularly under distribution shifts caused by variations in imaging equipment and clinical practice. Vision-language models (VLMs) exhibit strong generalization, and text prompts encode identity attributes, enabling explicit identification and removal of sensitive directions. However, existing debiasing approaches typically address vision and text modalities independently, leaving residual cross-modal misalignment and fairness gaps. To address this challenge, we propose DualFairVL, a multimodal prompt-learning framework that jointly debiases and aligns cross-modal representations. DualFairVL employs a parallel dual-branch architecture that separates sensitive and target attributes, enabling disentangled yet aligned representations across modalities. Approximately orthogonal text…
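The abstract's remark that text prompts allow "explicit identification and removal of sensitive directions" can be illustrated with a generic orthogonal-projection sketch. This is a standard debiasing technique under stated assumptions, not DualFairVL itself: the helper names, the toy three-dimensional embeddings, and the choice to estimate the direction from a single pair of prompt embeddings are all hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [a / norm for a in v]

def sensitive_direction(text_emb_group_a, text_emb_group_b):
    """Unit vector along the difference of two prompt embeddings.

    Assumption: the two prompts differ only in the sensitive attribute,
    so their difference approximates the sensitive direction.
    """
    return normalize([a - b for a, b in zip(text_emb_group_a, text_emb_group_b)])

def remove_direction(embedding, direction):
    """Project embedding onto the orthogonal complement of a unit direction."""
    coeff = dot(embedding, direction)
    return [e - coeff * d for e, d in zip(embedding, direction)]

# Toy embeddings (made-up values): the prompts differ only in the last axis.
emb_a = [1.0, 0.0, 1.0]
emb_b = [1.0, 0.0, -1.0]
direction = sensitive_direction(emb_a, emb_b)

image_emb = [0.5, 2.0, 3.0]
debiased = remove_direction(image_emb, direction)
# The debiased embedding has no component along the sensitive direction.
```

Applying such a projection to each modality separately is the kind of independent treatment the abstract says leaves residual cross-modal misalignment, which is the gap the joint approach is meant to close.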
