Performance Comparison of Balanced and Unbalanced Cancer Datasets using Pre-Trained Convolutional Neural Network
Ali Narin

TL;DR
This study compares balanced and unbalanced datasets for detecting breast cancer tumors with a pre-trained InceptionV3 CNN, and shows that balanced data improves overall detection accuracy and reliability.
Contribution
It provides an empirical analysis of how dataset balancing affects CNN performance in histopathological breast cancer classification.
Findings
Balanced data yields higher accuracy (93.55%)
Unbalanced data results in lower recall (82.89%)
Balanced datasets improve detection of benign and malignant tumors
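The findings above report accuracy and recall separately, and the distinction matters most when classes are unbalanced. The sketch below is illustrative only: the labels and counts are hypothetical and are not taken from the paper. It shows how a classifier can reach high overall accuracy on an unbalanced set while recall on the smaller class stays low.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall_per_class(y_true, y_pred):
    """For each class, the fraction of its true samples that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical unbalanced evaluation set (NOT the paper's data):
# 900 samples of a majority class and 100 of a minority class.
y_true = ["majority"] * 900 + ["minority"] * 100
# Hypothetical predictions: 880 of the majority and 60 of the minority are correct.
y_pred = (["majority"] * 880 + ["minority"] * 20
          + ["majority"] * 40 + ["minority"] * 60)

overall = accuracy(y_true, y_pred)
per_class = recall_per_class(y_true, y_pred)
print(f"accuracy: {overall:.4f}")
for label, value in per_class.items():
    print(f"recall[{label}]: {value:.4f}")
```

With these made-up counts the overall accuracy is dominated by the majority class, which is why per-class recall is worth reporting alongside it.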
Abstract
Cancer is one of the leading causes of death worldwide, and breast cancer is especially common in women. Establishing a definitive diagnosis of this cancer type is a long process, and the most important tool for early detection is histopathological images obtained by biopsy. Pathologists examine these images and make the definitive diagnosis. Computer-aided detection is widely used to support this process. In particular, the literature includes studies that detect benign or malignant tumors using data at different magnification factors. In this study, two study groups, one balanced and one unbalanced, were formed from the histopathological data in the BreakHis data set. We examined how performance in detecting tumor type differs between the balanced and unbalanced data sets. In conclusion,…
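The abstract does not say how the balanced study group was built. One common way to form a balanced group from an unbalanced one is random undersampling of the larger class. The sketch below assumes that approach purely for illustration; the function name `undersample`, the file names, and the class sizes are all hypothetical and do not describe the author's procedure.

```python
import random
from collections import Counter, defaultdict

def undersample(samples, seed=0):
    """Return a class-balanced subset of (item, label) pairs.

    Every class is randomly reduced to the size of the smallest class.
    This is an assumed balancing strategy, not the paper's stated method.
    """
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append((item, label))
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced

# Hypothetical unbalanced listing of image files (names and counts are made up).
unbalanced = ([(f"malignant_{i}.png", "malignant") for i in range(300)]
              + [(f"benign_{i}.png", "benign") for i in range(120)])

balanced = undersample(unbalanced)
counts = Counter(label for _, label in balanced)
print(dict(counts))
```

Undersampling discards data from the larger class; oversampling or class-weighted loss are alternatives when that loss of data is a concern.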
Taxonomy
Topics: AI in cancer detection · Digital Imaging for Blood Diseases · Radiomics and Machine Learning in Medical Imaging
Methods: Convolution
