Stratify or Inject: Two Simple Training Strategies to Improve Brain Tumor Segmentation

Raphael Meier; Michael Rebsamen; Urspeter Knecht; Mauricio Reyes; Roland Wiest; Richard McKinley

arXiv:1907.12941·eess.IV·July 31, 2019

TL;DR

This paper proposes two simple training strategies that incorporate tumor grade information to improve brain tumor segmentation accuracy in deep learning models, demonstrated on the BRATS 2018 dataset.

Contribution

Introduction of two novel training strategies that utilize tumor grade information to enhance brain tumor segmentation performance.

Findings

01

Both strategies improve segmentation accuracy over baseline methods.

02

Incorporating tumor grade reduces heterogeneity impact on model training.

03

Strategies are validated on the BRATS 2018 dataset.

Abstract

Deep learning methods for brain tumor segmentation are typically trained in an ad hoc fashion on all available data. Brain tumors are tremendously heterogeneous in image appearance and labeled training data is limited. We argue that incorporation of additional prior information, specifically tumor grade, associated with tumor imaging phenotypes during model training can significantly improve segmentation performance. Two strategies for incorporation of tumor grade during model training are proposed and their impact on segmentation performance is demonstrated on the BRATS 2018 dataset.

Tables

Table 1: Ratio in % of better performing subjects compared to baseline. p-values from a one-sided Wilcoxon signed rank test. Bold numbers indicate statistically significant (p < 0.05) results. (CE = contrast-enhancing tumor)

	CE	Core	Tumor
LGG vs. Baseline	41.7 (p=0.877)	49.3 (p=0.454)	54.7 (p=0.208)
HGG vs. Baseline	58.4 (p=0.005)	70.3 (p=5.659e-09)	46.7 (p=0.877)
HGG/LGG vs. Baseline	54.6 (p=0.127)	64.9 (p=1.441e-05)	48.8 (p=0.725)
Type-aware vs. Baseline	53.4 (p=0.321)	53.9 (p=0.028)	52.6 (p=0.231)

Taxonomy

Topics: Brain Tumor Detection and Classification · Advanced Neural Network Applications · Radiomics and Machine Learning in Medical Imaging

Full text

Medical Imaging with Deep Learning 2019. Extended Abstract, MIDL 2019 Submission

Raphael Meier (1), Michael Rebsamen (1), Urspeter Knecht (2), Mauricio Reyes (2,3), Roland Wiest (1), Richard McKinley (1)

(1) Support Center for Advanced Neuroimaging (SCAN), University Institute of Diagnostic and Interventional Neuroradiology, University of Bern, Inselspital, Bern University Hospital, Bern, Switzerland

(2) Institute for Surgical Technology and Biomechanics, University of Bern, Bern, Switzerland

(3) Healthcare Imaging A.I. Lab, Insel Data Science Center, Inselspital, Bern University Hospital, Bern, Switzerland

Accepted for MIDL 2019

1 Introduction

The segmentation of brain tumors has been a long-standing problem in medical image analysis. Research on this topic has been accelerated greatly by the availability of public datasets such as the Brain Tumor Segmentation (BRATS) Challenge dataset [Menze et al., 2015; Bakas et al., 2017a,b,c]. Currently, the best-performing methods for brain tumor segmentation are based on deep learning [Bakas et al., 2018], with first approaches being applied in clinically critical areas such as tumor response assessment [Kickingereder et al., 2019] or radiation therapy planning [Jungo et al., 2018]. A general view in deep learning is that more training data yields better generalization performance. For computer vision tasks, model performance was shown to increase logarithmically with the volume of training data [Sun et al., 2017]. Consequently, deep learning segmentation models are trained on all available data, often neglecting peculiarities of the data at hand.

The BRATS Challenge has so far been concerned with the segmentation of glioma, which are primary tumors of the central nervous system. Glioma can be classified into different tumor grades based on the underlying molecular characteristics and histology [Louis et al., 2016]. A higher grade reflects increasing malignancy, and glioma are commonly grouped into high-grade (grade III/IV) and low-grade glioma (grade I/II). Furthermore, they exhibit tremendous genetic and molecular heterogeneity, which spans tumor grades but also manifests itself within a particular type such as glioblastoma (grade IV) [Verhaak et al., 2010; Sottoriva et al., 2013]. The underlying biological configuration of a tumor has been associated with distinctly different tumor imaging phenotypes [Grossmann et al., 2016]. In general, low-grade glioma present much less or no contrast enhancement compared to high-grade glioma [Forst et al., 2014]. Deep learning methods face the challenge of generalizing successfully across these different imaging phenotypes.

We hypothesize that brain tumor segmentation performance of deep learning methods can be improved by utilizing prior information associated with tumor imaging phenotypes during model training. Thus, we propose two simple training strategies targeted at tumor grade and evaluate their effectiveness on the BRATS 2018 dataset using a recently proposed, top-ranked method [McKinley et al., 2019]. This work is part of a more extensive study currently being submitted to a journal.

2 Methods

Model architecture. The deep learning method is a shallow U-Net-style model with down- and upsampling connections, featuring densely connected blocks of dilated convolutions. For more details on the model architecture, we refer to [McKinley et al., 2019].

Incorporation of tumor grade. We propose two strategies to utilize information on tumor grade at the stage of model training. The first strategy consists of stratifying the training data into high-grade glioma (HGG) and low-grade glioma (LGG) cases and training two separate models. During testing, the respective model is applied to testing data with the corresponding tumor grade. As a second strategy, we propose an injection of the tumor grade. In addition to feeding the model with imaging data consisting of the co-registered Magnetic Resonance (MR) sequences, we provide it with a binary input indicating whether the case at hand is an LGG or an HGG. The tumor type is injected as an image volume, with dimensions identical to the MR sequences and all voxel values set either to zeros or ones. The model is then trained on this enlarged dataset (type-aware network). For both strategies, the model architecture and hyperparameter settings remain unchanged.
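The injection strategy described above can be illustrated with a minimal sketch. It assumes a channels-first array layout, four MR channels, and an encoding of ones for HGG and zeros for LGG; the text only states that the values are zeros or ones, so the mapping, the function name `inject_grade`, and the toy volume size are all hypothetical.

```python
import numpy as np

def inject_grade(mr_volumes: np.ndarray, is_hgg: bool) -> np.ndarray:
    """Append a constant tumor-grade channel to a stack of MR sequences.

    mr_volumes: array of shape (C, D, H, W), one channel per co-registered sequence.
    is_hgg: assumed encoding, True -> all ones (HGG), False -> all zeros (LGG).
    """
    grade_value = 1.0 if is_hgg else 0.0
    # The grade volume has the same spatial dimensions as each MR sequence.
    grade_channel = np.full((1,) + mr_volumes.shape[1:], grade_value,
                            dtype=mr_volumes.dtype)
    return np.concatenate([mr_volumes, grade_channel], axis=0)

# Hypothetical case: four MR sequences on a small 8x8x8 grid.
case = np.random.default_rng(0).normal(size=(4, 8, 8, 8)).astype(np.float32)
hgg_input = inject_grade(case, is_hgg=True)
lgg_input = inject_grade(case, is_hgg=False)
print(hgg_input.shape)  # (5, 8, 8, 8)
```

Under these assumptions the network only needs one additional input channel; everything else in the training pipeline can stay as it is, which matches the statement that architecture and hyperparameters remain unchanged.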

We used the training data of the BRATS 2018 Challenge, which includes 75 patients with LGGs and 210 patients with HGGs. We did not include the testing data in the analysis since tumor grade is blinded for those cases. A detailed description of the imaging data can be found in [Bakas et al., 2018]. Four different models were trained using five-fold cross-validation: i) a baseline model using all available cases (N=285), ii) a model trained only on HGG data (N=210), iii) a model trained only on LGG data (N=75), and iv) the type-aware network trained on all cases (N=285).
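As a rough sketch of the stratification strategy combined with five-fold cross-validation, the snippet below groups cases by grade and builds folds within each group. The case identifiers, the helper names, and the position-based fold assignment are assumptions for illustration; the text does not describe how folds were drawn.

```python
from collections import defaultdict

def stratify_by_grade(cases):
    """Group (case_id, grade) pairs into one list of case IDs per grade."""
    groups = defaultdict(list)
    for case_id, grade in cases:
        groups[grade].append(case_id)
    return dict(groups)

def five_fold_splits(case_ids, n_folds=5):
    """Yield (train_ids, val_ids); assigning folds by position is an assumption."""
    for k in range(n_folds):
        val_ids = case_ids[k::n_folds]
        val_set = set(val_ids)
        train_ids = [c for c in case_ids if c not in val_set]
        yield train_ids, val_ids

# Hypothetical identifiers matching the cohort sizes quoted above.
cases = ([(f"HGG_{i:03d}", "HGG") for i in range(210)]
         + [(f"LGG_{i:03d}", "LGG") for i in range(75)])
groups = stratify_by_grade(cases)

for grade, ids in groups.items():
    for train_ids, val_ids in five_fold_splits(ids):
        # A grade-specific model would be trained on train_ids
        # and evaluated on val_ids of the same grade.
        assert not set(train_ids) & set(val_ids)

print({grade: len(ids) for grade, ids in groups.items()})  # {'HGG': 210, 'LGG': 75}
```

The baseline and the type-aware network would instead use folds drawn from all 285 cases.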

3 Results & Conclusion

The segmentation performance of the four different deep learning models, in terms of the fraction of cases with improved Dice coefficient, is shown in Table 1. On the HGG data alone, the HGG model outperforms the baseline model in 58.4% of the cases for contrast-enhancing tumor and in 70.3% of the cases for tumor core segmentation; both improvements are statistically significant. Looking at all the data, including LGG, we see a significant improvement for the tumor core segmentation in 64.9% of the cases when using the two models trained on stratified data compared to the baseline model trained on all available data. Finally, the type-aware network yields a minor but significant improvement for the segmentation of the tumor core in 53.9% of the cases. The accompanying figure (label fig:epochs) shows that the observed improvements are relatively consistent across all epochs of model training.
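The two quantities reported in Table 1, the percentage of subjects with a better Dice coefficient and the p-value of a one-sided Wilcoxon signed rank test, could be computed along the following lines. This is a sketch under assumptions: ties are not counted as improvements, two empty masks are given a Dice of 1.0, and the per-subject scores are synthetic stand-ins, not results from the paper.

```python
import numpy as np
from scipy.stats import wilcoxon

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient of two binary masks (1.0 if both are empty, by assumption)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)

def compare_to_baseline(dice_model, dice_baseline):
    """Return (% of subjects where the model beats the baseline, one-sided p-value)."""
    dice_model = np.asarray(dice_model, dtype=float)
    dice_baseline = np.asarray(dice_baseline, dtype=float)
    # Strictly greater: ties are not counted as improvements (assumption).
    ratio_better = 100.0 * float(np.mean(dice_model > dice_baseline))
    # Alternative hypothesis: paired differences (model - baseline) tend to be positive.
    result = wilcoxon(dice_model, dice_baseline, alternative="greater")
    return ratio_better, float(result.pvalue)

# Synthetic per-subject Dice scores, for illustration only.
rng = np.random.default_rng(0)
baseline_scores = rng.uniform(0.6, 0.9, size=50)
model_scores = np.clip(baseline_scores + rng.normal(0.02, 0.03, size=50), 0.0, 1.0)
ratio, p_value = compare_to_baseline(model_scores, baseline_scores)
print(f"{ratio:.1f}% better (p={p_value:.3g})")
```

A ratio above 50% together with a small p-value would correspond to a bold entry in the table.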

We have proposed two strategies for incorporating tumor grade information during model training, which are straightforward to apply, and demonstrated their effectiveness on the BRATS 2018 dataset. While data stratification yields clear improvements, more advanced network architectures that incorporate prior information about tumor grade beyond injecting it as an additional input should be investigated further. In addition, our strategies could be used in conjunction with networks for tumor typing.

Acknowledgments

This work was supported by the Swiss National Foundation, grant number 169607.

Bibliography

1. [Bakas et al., 2017a] Spyridon Bakas, Hamed Akbari, et al. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data, 4:170117, September 2017. doi:10.1038/sdata.2017.117. URL https://doi.org/10.1038/sdata.2017.117.
2. [Bakas et al., 2017b] Spyridon Bakas, Hamed Akbari, et al. Segmentation labels for the pre-operative scans of the TCGA-GBM collection, 2017. URL https://wiki.cancerimagingarchive.net/x/KoZyAQ.
3. [Bakas et al., 2017c] Spyridon Bakas, Hamed Akbari, et al. Segmentation labels for the pre-operative scans of the TCGA-LGG collection, 2017. URL https://wiki.cancerimagingarchive.net/x/LIZyAQ.
4. [Bakas et al., 2018] Spyridon Bakas, Mauricio Reyes, et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. CoRR, abs/1811.02629, 2018. URL http://arxiv.org/abs/1811.02629.
5. [Forst et al., 2014] D. A. Forst, B. V. Nahed, et al. Low-grade gliomas. The Oncologist, 19(4):403–413, March 2014. doi:10.1634/theoncologist.2013-0345. URL https://doi.org/10.1634/theoncologist.2013-0345.
6. [Grossmann et al., 2016] Patrick Grossmann, David A. Gutman, et al. Imaging-genomics reveals driving pathways of MRI derived volumetric tumor phenotype features in glioblastoma. BMC Cancer, 16(1):611, August 2016. ISSN 1471-2407. doi:10.1186/s12885-016-2659-5. URL https://doi.org/10.1186/s12885-016-2659-5.
7. [Jungo et al., 2018] Alain Jungo, Raphael Meier, et al. Uncertainty-driven sanity check: Application to postoperative brain tumor cavity segmentation. CoRR, abs/1806.03106, 2018. URL http://arxiv.org/abs/1806.03106.
8. [Kickingereder et al., 2019] Philipp Kickingereder, Fabian Isensee, et al. Automated quantitative tumour response assessment of MRI in neuro-oncology with artificial neural networks: a multicentre, retrospective study. The Lancet Oncology, April 2019. doi:10.1016/s1470-2045(19)30098-1. URL https://doi.org/10.1016/s1470-2045(19)30098-1.