Visual Bias and Interpretability in Deep Learning for Dermatological Image Analysis
Enam Ahmed Taufik, Abdullah Khondoker, Antara Firoz Parsa, Seraj Al Mahmud Mostafa

TL;DR
This paper evaluates five pre-trained deep learning models and three image pre-processing techniques for skin disease classification, reporting that transformer-based models such as DinoV2 with standard RGB input achieve high accuracy (up to 93%) and interpretable predictions in dermatological image analysis.
Contribution
It systematically benchmarks pre-processing methods and deep learning architectures, highlighting the effectiveness of transformer models and RGB pre-processing for skin disease classification.
Findings
DinoV2 with RGB achieves up to 93% accuracy.
Grad-CAM visualizations improve interpretability.
Pre-processing significantly impacts model performance.
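The accuracy figure above, and the F1-score the paper also uses, can be computed from predicted and true labels as in this minimal sketch. It assumes macro averaging (an unweighted mean of per-class F1); the summary does not say which averaging the authors used, and the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # missed a sample of class t
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro averaging weights every class equally, which matters when some skin conditions are much rarer than others; a weighted average would instead follow the class frequencies.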
Abstract
Accurate skin disease classification is a critical yet challenging task due to high inter-class similarity, intra-class variability, and complex lesion textures. While deep learning-based computer-aided diagnosis (CAD) systems have shown promise in automating dermatological assessments, their performance is highly dependent on image pre-processing and model architecture. This study proposes a deep learning framework for multi-class skin disease classification, systematically evaluating three image pre-processing techniques: standard RGB, CMY color space transformation, and Contrast Limited Adaptive Histogram Equalization (CLAHE). We benchmark the performance of pre-trained convolutional neural networks (DenseNet201, EfficientNetB5) and transformer-based models (ViT, Swin Transformer, DinoV2 Large) using accuracy and F1-score as evaluation metrics. Results show that DinoV2 with RGB…
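The two non-trivial pre-processing steps named in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the CMY transform is taken to be the per-channel complement of 8-bit RGB, CLAHE is applied to the lightness channel in LAB space (a common choice, since the abstract does not say how color images were handled), and the `clip_limit` and `tile_grid` defaults are placeholders rather than values from the paper.

```python
import numpy as np


def rgb_to_cmy(rgb):
    """Complement an 8-bit RGB image: C = 255 - R, M = 255 - G, Y = 255 - B."""
    if rgb.dtype != np.uint8:
        raise TypeError("expected an 8-bit (uint8) RGB image")
    return 255 - rgb


def apply_clahe(rgb, clip_limit=2.0, tile_grid=(8, 8)):
    """Equalize local contrast on the L channel only (an assumed design).

    clip_limit and tile_grid are illustrative defaults, not paper settings.
    """
    import cv2  # imported lazily so rgb_to_cmy works without OpenCV

    lab = cv2.cvtColor(rgb, cv2.COLOR_RGB2LAB)
    l_chan, a_chan, b_chan = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid)
    l_eq = clahe.apply(l_chan)  # CLAHE operates on a single channel
    return cv2.cvtColor(cv2.merge((l_eq, a_chan, b_chan)), cv2.COLOR_LAB2RGB)
```

Equalizing only the lightness channel leaves hue largely unchanged, which is why it is often preferred over equalizing R, G, and B independently; whether the paper did the same is not stated.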
