Natural vs Balanced Distribution in Deep Learning on Whole Slide Images for Cancer Detection

Ismat Ara Reshma; Sylvain Cussat-Blanc; Radu Tudor Ionescu; Hervé Luga; Josiane Mothe

arXiv:2012.11684·cs.CV·December 23, 2020

TL;DR

This study compares natural and balanced class distributions in deep learning models trained on whole slide images for cancer detection, finding that natural distributions yield fewer false positives with comparable false negatives.

Contribution

The paper provides an empirical analysis showing that training on the natural class distribution of WSIs yields better model performance than training on an artificially balanced dataset.

Findings

01 The natural distribution results in fewer false positives.

02 False negatives are comparable between the two distributions.

03 The natural distribution outperforms the balanced one across all evaluation metrics.

Abstract

The class distribution of data is one of the factors that regulates the performance of machine learning models. However, investigations on the impact of different distributions available in the literature are very few, sometimes absent for domain-specific tasks. In this paper, we analyze the impact of natural and balanced distributions of the training set in deep learning (DL) models applied on histological images, also known as whole slide images (WSIs). WSIs are considered as the gold standard for cancer diagnosis. In recent years, researchers have turned their attention to DL models to automate and accelerate the diagnosis process. In the training of such DL models, filtering out the non-regions-of-interest from the WSIs and adopting an artificial distribution (usually, a balanced distribution) is a common trend. In our analysis, we show that keeping the WSIs data in their usual…
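
To make the two training-set distributions concrete, here is a minimal sketch of how patch indices might be drawn under each one. It is an illustration only, not the authors' sampling procedure: the function names, the 900/100 class split, and the choice of undersampling to obtain balance are all assumptions made for this example.

```python
import random
from collections import Counter


def natural_sample(labels, n, seed=0):
    """Draw n patch indices uniformly at random, so the class proportions
    follow whatever the pool contains (the "natural" distribution)."""
    rng = random.Random(seed)
    return rng.sample(range(len(labels)), n)


def balanced_sample(labels, n, seed=0):
    """Draw n patch indices with an equal number from each class,
    undersampling the larger classes (an assumed balancing strategy)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    per_class = n // len(by_class)
    picked = []
    for label in sorted(by_class):
        picked.extend(rng.sample(by_class[label], per_class))
    rng.shuffle(picked)
    return picked


def class_counts(labels, indices):
    """Count how many selected patches fall in each class."""
    return Counter(labels[i] for i in indices)


# Hypothetical toy pool: 900 non-cancer patches (0) and 100 cancer patches (1).
labels = [0] * 900 + [1] * 100
natural = natural_sample(labels, 100)
balanced = balanced_sample(labels, 100)
print("natural: ", class_counts(labels, natural))
print("balanced:", class_counts(labels, balanced))
```

In this toy setup the balanced draw always contains 50 patches per class, while the natural draw mirrors the pool and is therefore dominated by non-cancer patches. The abstract's comparison concerns which of these two kinds of training set leads to fewer false positives at comparable false negatives.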
