Handling Class Imbalance Problem in Skin Lesion Classification: Finding Strengths and Weaknesses of Various Balancing Techniques
Ariful Islam Khandaker, Abdullah Al Shafi, Mohiuddin Ahmad

TL;DR
This paper reviews various class balancing techniques for skin lesion classification, demonstrating how hybrid methods like SMOTE+Tomek Links improve model robustness and generalization in imbalanced datasets.
Contribution
It provides an exhaustive comparison of balancing methods applied to skin lesion datasets, highlighting the strengths and weaknesses of each approach in deep learning models.
Findings
Over-sampling improves accuracy but risks overfitting.
Hybrid methods like SMOTE+TL enhance model generalization.
Choosing appropriate balancing techniques is crucial for medical diagnostics.
Abstract
Automatic skin lesion classification from dermoscopy images is important for the early diagnosis of skin diseases such as melanoma. Class imbalance in skin lesion datasets, notably the under-representation of malignant (cancerous) cases, hampers the performance and generalization of deep learning models. This paper reviews several balancing methods that address class imbalance, using the ISIC 2016 dataset as an example. A lightweight CNN model, MobileNetV2, was combined with under-sampling, over-sampling, and hybrid balancing methods such as Tomek Links (TL), SMOTE, and SMOTE with TL. Over-sampling methods like SMOTE and ADASYN improve performance but may lead to overfitting due to redundant synthetic samples. Hybrid methods like SMOTE+TL counter this drawback by removing noisy or boundary samples so that model generalization…
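To make the SMOTE+TL idea in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes samples are small numeric feature vectors (toy 2-D points stand in for image features here), and the function names `smote` and `remove_tomek_links`, the toy data, and the choice to drop both members of each Tomek link are illustrative assumptions. A common variant drops only the majority-class member.

```python
import math
import random


def nearest_neighbors(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean), excluding `point` itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(base, minority, k))
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link: a pair of opposite-class
    points that are each other's nearest neighbour."""
    def nn_index(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    nn = [nn_index(i) for i in range(len(X))]
    drop = {i for i in range(len(X)) if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: label 0 = benign (majority), label 1 = malignant (minority).
benign = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
          (0.4, 0.4), (0.2, 0.4), (0.9, 0.9)]
malignant = [(1.0, 1.0), (1.2, 1.1), (1.1, 1.3)]

# Step 1 (over-sampling): synthesize minority points until the classes are equal in size.
synthetic = smote(malignant, n_new=len(benign) - len(malignant))
X = benign + malignant + synthetic
y = [0] * len(benign) + [1] * (len(malignant) + len(synthetic))

# Step 2 (cleaning): remove cross-class nearest-neighbour pairs near the boundary.
X_clean, y_clean = remove_tomek_links(X, y)
print("before cleaning:", y.count(0), "benign,", y.count(1), "malignant")
print("after cleaning: ", y_clean.count(0), "benign,", y_clean.count(1), "malignant")
```

Because every synthetic point lies on a segment between two existing minority samples, plain over-sampling adds no information beyond those samples, which is one way to read the overfitting risk the abstract mentions. The cleaning step then discards pairs that sit on the class boundary. In practice a maintained library implementation would normally be preferred over this sketch.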
Taxonomy
Topics: Cutaneous Melanoma Detection and Management · Imbalanced Data Classification Techniques · Digital Imaging for Blood Diseases
