Assessment of Using Synthetic Data in Brain Tumor Segmentation
Aditi Jahagirdar, Sameer Joshi

TL;DR
This paper explores using GAN-generated synthetic MRI data to augment training datasets for brain tumor segmentation. Models trained on hybrid datasets achieve quantitative results comparable to real-only training, with improved boundary delineation.
Contribution
The paper demonstrates the feasibility of using synthetic data to enhance brain tumor segmentation models and provides insight into the proportions in which real and synthetic data should be mixed.
Findings
Qualitative inspection suggests that training on hybrid datasets with 40% real and 60% synthetic data improves boundary delineation.
Quantitative performance metrics are similar between models trained on real-only data and models trained on hybrid datasets.
Region-wise accuracy for tumor core and enhancing tumor remains lower, indicating class imbalance issues.
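To make the mixing proportion concrete, here is a minimal sketch of how a hybrid training set with a fixed real-to-synthetic ratio could be assembled. It is not the authors' pipeline: the function name `build_hybrid_dataset`, its parameters, and the string placeholders standing in for MRI samples are all assumptions made for illustration.

```python
import random


def build_hybrid_dataset(real, synthetic, real_fraction, total, seed=0):
    """Draw `total` samples so that roughly `real_fraction` of them are real.

    Hypothetical helper for illustration; `real` and `synthetic` are lists
    of samples (for example file paths or image/mask pairs).
    """
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must lie in [0, 1]")

    n_real = round(total * real_fraction)
    n_synth = total - n_real
    if n_real > len(real) or n_synth > len(synthetic):
        raise ValueError("not enough samples for the requested mix")

    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    mixed = rng.sample(real, n_real) + rng.sample(synthetic, n_synth)
    rng.shuffle(mixed)  # interleave real and synthetic samples
    return mixed


# Placeholder sample identifiers (assumed, not from the paper).
real_pool = [f"real_{i}" for i in range(100)]
synthetic_pool = [f"synthetic_{i}" for i in range(100)]

# 40% real, 60% synthetic, as in the finding above.
hybrid = build_hybrid_dataset(real_pool, synthetic_pool, real_fraction=0.4, total=50)
n_real_in_mix = sum(1 for s in hybrid if s.startswith("real_"))
print(n_real_in_mix, len(hybrid) - n_real_in_mix)
```

Sampling without replacement and fixing the seed are design choices of this sketch; the source does not say how samples were selected for each proportion.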
Abstract
Manual brain tumor segmentation from MRI scans is challenging due to tumor heterogeneity, scarcity of annotated data, and class imbalance in medical imaging datasets. Synthetic data generated by generative models has the potential to mitigate these issues by improving dataset diversity. This study investigates, as a proof of concept, the impact of incorporating synthetic MRI data, generated using a pre-trained GAN model, into training a U-Net segmentation network. Experiments were conducted using real data from the BraTS 2020 dataset, synthetic data generated with the medigan library, and hybrid datasets combining real and synthetic samples in varying proportions. While overall quantitative performance (Dice coefficient, IoU, precision, recall, accuracy) was comparable between real-only and hybrid-trained models, qualitative inspection suggested that hybrid datasets, particularly with…
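As an illustration of two of the overlap metrics named in the abstract, the following sketch computes the Dice coefficient and IoU for a pair of binary masks using their standard definitions. It is not taken from the paper: the function name `dice_and_iou`, the flattened-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized, flattened binary masks.

    Hypothetical helper for illustration. Any truthy value counts as
    foreground, any falsy value as background.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_pos = sum(1 for p in pred if p)
    target_pos = sum(1 for t in target if t)
    union = pred_pos + target_pos - intersection

    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_pos + target_pos)
    iou = intersection / union
    return dice, iou


# Toy 4-pixel masks: one overlapping foreground pixel out of two each.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
print(dice, iou)
```

For a multi-class task such as BraTS, a function like this would typically be applied once per region (for example tumor core or enhancing tumor) after binarizing the label map for that region.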
