MediFact at MEDIQA-M3G 2024: Medical Question Answering in Dermatology with Multimodal Learning
Nadia Saeed

TL;DR
This paper introduces a multimodal, multilingual medical question-answering system for dermatology that uses weakly supervised learning and visual-text fusion models to generate comprehensive answers without predefined options.
Contribution
It presents a novel weakly supervised learning approach combining visual and textual data for open-ended dermatology QA across multiple languages.
Findings
Effective multilingual skin condition representations
Improved answer generation with multimodal fusion
Advances in open-ended medical question answering
Abstract
The MEDIQA-M3G 2024 challenge necessitates novel solutions for Multilingual & Multimodal Medical Answer Generation in dermatology (Yim et al., 2024a). This paper addresses the limitations of traditional methods by proposing a weakly supervised learning approach for open-ended medical question-answering (QA). Our system leverages readily available MEDIQA-M3G images via a VGG16-CNN-SVM model, enabling multilingual (English, Chinese, Spanish) learning of informative skin condition representations. Using pre-trained QA models, we further bridge the gap between visual and textual information through multimodal fusion. This approach tackles complex, open-ended questions even without predefined answer choices. We generate comprehensive answers by feeding the ViT-CLIP model multiple candidate responses alongside images. This work advances medical QA research, paving the way for…
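The abstract says candidate responses are fed to a ViT-CLIP model together with the image, but it does not say how they are scored. One common way to use a CLIP-style model for this is to rank candidates by cosine similarity between the image embedding and each response embedding. The sketch below illustrates only that ranking step, under that assumption. The function names `cosine` and `rank_responses` are hypothetical, and the vectors are hand-made placeholders standing in for encoder outputs, not results from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_responses(image_emb, response_embs):
    """Return (response, score) pairs sorted by similarity to the image, best first.

    Assumption: image and text embeddings live in one shared space,
    as they would coming out of a CLIP-style encoder pair.
    """
    scored = [(text, cosine(image_emb, emb)) for text, emb in response_embs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder embeddings (made up for illustration, not model outputs).
image_emb = [0.9, 0.1, 0.0]
response_embs = {
    "candidate response A": [0.8, 0.2, 0.1],
    "candidate response B": [0.1, 0.9, 0.2],
    "candidate response C": [0.0, 0.1, 0.9],
}

ranking = rank_responses(image_emb, response_embs)
best_response, best_score = ranking[0]
```

In a real pipeline the placeholder vectors would be replaced by the image and text embeddings produced by the chosen encoder, and the top-ranked response (or the top few) would feed the final answer.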
Taxonomy
Topics: Health Literacy and Information Accessibility · Wikis in Education and Collaboration · Discourse Analysis in Language Studies
