Multimodal system for skin cancer detection
Volodymyr Sydorskyi, Igor Krashenyi, Oleksii Yakubenko

TL;DR
This paper presents a multi-modal melanoma detection system that combines conventional photos and metadata, improving accuracy and accessibility over traditional dermoscopic image-based models.
Contribution
It introduces a versatile, multi-stage neural network architecture that integrates image data with tabular metadata, addresses dataset imbalance, and improves detection performance.
Findings
Achieved a peak Partial ROC AUC of 0.18068.
Top-15 retrieval sensitivity of 0.78371.
Demonstrated significant performance gains with multi-modal integration.
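The summary does not define the partial ROC AUC it reports. As a rough sketch only, the code below assumes one common definition: the area under the ROC curve that lies above a minimum true-positive rate. The function names, the `min_tpr=0.8` default, and the toy labels are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def roc_points(y_true, y_score):
    """Return (fpr, tpr) arrays of the empirical ROC curve, starting at (0, 0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    order = np.argsort(-y_score, kind="stable")  # highest score first
    y_true, y_score = y_true[order], y_score[order]
    # Keep one point per distinct score so tied scores share a threshold.
    last = np.r_[np.nonzero(np.diff(y_score))[0], y_true.size - 1]
    tps = np.cumsum(y_true)[last]   # true positives at each threshold
    fps = (last + 1) - tps          # false positives at each threshold
    return np.r_[0.0, fps / fps[-1]], np.r_[0.0, tps / tps[-1]]


def partial_auc_above_tpr(y_true, y_score, min_tpr=0.8):
    """Area between the ROC curve and the horizontal line TPR = min_tpr.

    min_tpr=0.8 is an assumed threshold, not one stated in the summary.
    """
    fpr, tpr = roc_points(y_true, y_score)
    area = 0.0
    for f0, t0, f1, t1 in zip(fpr[:-1], tpr[:-1], fpr[1:], tpr[1:]):
        if t1 <= min_tpr:
            continue  # segment lies entirely below the floor
        if t0 < min_tpr:
            # Segment crosses the floor: start from the crossing point.
            f0 = f0 + (min_tpr - t0) / (t1 - t0) * (f1 - f0)
            t0 = min_tpr
        # Trapezoid rule on the part of the segment above the floor.
        area += (f1 - f0) * ((t0 - min_tpr) + (t1 - min_tpr)) / 2.0
    return area


# Toy data: a perfectly separated set and a set where every score is tied.
perfect = partial_auc_above_tpr([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
tied = partial_auc_above_tpr([0, 1], [0.5, 0.5])
```

Under this assumed definition the score is bounded by 1 - min_tpr, so a perfect ranking scores 0.2 and an uninformative one scores 0.02 when the threshold is 0.8.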
Abstract
Melanoma detection is vital for early diagnosis and effective treatment. While deep learning models on dermoscopic images have shown promise, they require specialized equipment, limiting their use in broader clinical settings. This study introduces a multi-modal melanoma detection system using conventional photo images, making it more accessible and versatile. Our system integrates image data with tabular metadata, such as patient demographics and lesion characteristics, to improve detection accuracy. It employs a multi-modal neural network combining image and metadata processing and supports a two-step model for cases with or without metadata. A three-stage pipeline further refines predictions using boosting algorithms, enhancing performance. To address the challenges of a highly imbalanced dataset, specific techniques were implemented to ensure robust training. An ablation study…
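The abstract says the system handles cases with or without metadata but gives no implementation details here. One possible reading, sketched below purely as an illustration, is a fallback between an image-only head and a head that sees the image embedding concatenated with metadata features. The dimensions, the random weights, and the `predict` function are all hypothetical stand-ins, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

IMG_DIM, META_DIM = 8, 3  # hypothetical embedding and metadata sizes

# Randomly initialised weights standing in for trained heads (assumption).
W_image_only = rng.normal(size=IMG_DIM)
W_fused = rng.normal(size=IMG_DIM + META_DIM)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict(image_embedding, metadata=None):
    """Return a probability-like score, using metadata only when it is given."""
    if metadata is None:
        # No metadata available: fall back to the image-only head.
        return float(sigmoid(image_embedding @ W_image_only))
    # Metadata available: concatenate both modalities before the head.
    fused = np.concatenate([image_embedding, metadata])
    return float(sigmoid(fused @ W_fused))


embedding = rng.normal(size=IMG_DIM)   # stand-in for an image encoder output
meta = np.array([0.45, 1.0, 0.2])      # stand-in for scaled patient/lesion features

score_without_meta = predict(embedding)
score_with_meta = predict(embedding, meta)
```

Both calls return a score between 0 and 1. In a real system the two heads would be trained, and the image embedding would come from an image encoder rather than random numbers.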
Taxonomy
Topics: Cutaneous Melanoma Detection and Management · AI in cancer detection · Face recognition and analysis
