# Transferability of Deep Learning Algorithms for Malignancy Detection in Confocal Laser Endomicroscopy Images from Different Anatomical Locations of the Upper Gastrointestinal Tract

**Authors:** Marc Aubreville, Miguel Goncalves, Christian Knipfer, Nicolai Oetter, Helmut Neumann, Florian Stelzle, Christopher Bohr, Andreas Maier

arXiv: 1902.08985 · 2020-01-06

## TL;DR

This study compares two methods for automatic malignancy detection in confocal laser endomicroscopy images in a transfer learning setting (training on one upper gastrointestinal location and applying the classifier to another). It introduces a novel image-level classification approach that improves accuracy and generalizes from oral cavity to vocal fold data.

## Contribution

A new image-level classification method based on pre-trained Inception V.3 improves SCC detection in CLE images and demonstrates strong transferability across different anatomical sites.

## Key findings

- Achieved 91.63% accuracy on the oral cavity data set and 92.63% on the joint data set
- Maintained similar ROC AUC when transferring to vocal fold data
- The proposed image-level method improves recognition performance over the previously proposed patch-based CNN method
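
The ROC AUC used to compare the transfer and direct-training settings can be read as the probability that a randomly chosen malignant image receives a higher score than a randomly chosen healthy one. The following is a minimal sketch of that pairwise definition in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (illustrative helper).

    labels: iterable of 0 (negative) or 1 (positive)
    scores: iterable of classifier scores, higher means "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two healthy and two malignant images.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based implementation would be preferable for large data sets.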

## Abstract

Squamous Cell Carcinoma (SCC) is the most common cancer type of the epithelium and is often detected at a late stage. Besides invasive diagnosis of SCC by means of biopsy and histopathologic assessment, Confocal Laser Endomicroscopy (CLE) has emerged as a noninvasive method that has been used successfully to diagnose SCC in vivo. Interpreting CLE images, however, requires extensive training, which limits the method's applicability and use in clinical practice. To aid diagnosis of SCC in a broader scope, automatic detection methods have been proposed. This work compares two methods with regard to their applicability in a transfer learning sense, i.e. training on one tissue type (from one clinical team) and applying the learnt classification system to another entity (different anatomy, different clinical team). Besides a previously proposed patch-based method based on convolutional neural networks, a novel classification method on image level (based on a pre-trained Inception V.3 network with dedicated preprocessing and interpretation of class activation maps) is proposed and evaluated. The newly presented approach improves recognition performance, yielding accuracies of 91.63% on the first data set (oral cavity) and 92.63% on a joint data set. Generalization from the oral cavity to the second data set (vocal folds) led to area-under-the-ROC-curve values similar to those obtained by training directly on the vocal folds data set, indicating good generalization.
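
The abstract mentions class activation maps without giving details in this summary view. As a rough illustration only, the generic formulation for a network ending in global average pooling and a dense layer weights each final feature map by the dense-layer weight of the class of interest and sums them. The sketch below assumes that generic formulation; the function name, the tiny feature maps, and the weights are hypothetical, and the paper's actual procedure may differ.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps, rescaled to [0, 1] (illustrative helper).

    feature_maps: list of K maps, each a list of H rows with W floats
    class_weights: list of K dense-layer weights for one class
    """
    if len(feature_maps) != len(class_weights):
        raise ValueError("one weight per feature map is required")

    height = len(feature_maps[0])
    width = len(feature_maps[0][0])

    # cam[y][x] = sum over k of w_k * f_k[y][x]
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]

    # Min-max rescaling so the map can be shown as a heat map.
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0] * width for _ in range(height)]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]


# Hypothetical 2x2 feature maps from two channels and their class weights.
toy_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_weights = [1.0, 0.5]
toy_cam = class_activation_map(toy_maps, toy_weights)
```

In practice the resulting low-resolution map is upsampled to the input image size before being overlaid on the image.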

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.08985/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1902.08985/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1902.08985/full.md

---
Source: https://tomesphere.com/paper/1902.08985