Efficient Large Scale Medical Image Dataset Preparation for Machine Learning Applications
Stefan Denner, Jonas Scherer, Klaus Kades, Dimitrios Bounias, Philipp Schader, Lisa Kausch, Markus Bujotzek, Andreas Michael Bucher, Tobias Penzkofer, Klaus Maier-Hein

TL;DR
This paper presents a new data curation tool within the Kaapana toolkit designed to efficiently organize, manage, and validate large-scale medical imaging datasets for machine learning, addressing current limitations in data handling and bias detection.
Contribution
The paper introduces an innovative, open-source data curation tool tailored for medical imaging datasets, enhancing organization, quality control, and bias detection for machine learning applications.
Findings
Improved dataset organization and management functionalities.
Enhanced quality control and validation processes.
Ability to identify dataset biases through metadata analysis.
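As an illustration of the last finding, the sketch below shows one simple way metadata can expose a dataset bias: compute how often each value of a metadata field occurs and flag values that dominate. It is not the Kaapana tool's implementation or API. The function names, the 0.75 threshold, and the sample records are assumptions made for this example; the keys only mirror common DICOM attribute names.

```python
from collections import Counter


def metadata_share(records, field):
    """Return the fraction of records carrying each value of a metadata field."""
    # Records missing the field are grouped under "UNKNOWN".
    counts = Counter(r.get(field, "UNKNOWN") for r in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def flag_imbalance(shares, threshold=0.75):
    """Return the values whose share meets or exceeds the threshold.

    The threshold is an arbitrary illustrative choice, not a recommended value.
    """
    return [value for value, share in shares.items() if share >= threshold]


# Hypothetical per-series metadata, e.g. as exported from DICOM headers.
records = [
    {"Manufacturer": "VendorA", "Modality": "CT"},
    {"Manufacturer": "VendorA", "Modality": "CT"},
    {"Manufacturer": "VendorA", "Modality": "CT"},
    {"Manufacturer": "VendorA", "Modality": "MR"},
    {"Manufacturer": "VendorB", "Modality": "CT"},
]

shares = metadata_share(records, "Manufacturer")
dominant = flag_imbalance(shares)
```

Here most series come from a single hypothetical vendor, so a model trained on this set may not generalize to other scanners. The same check can be run on any other metadata field, such as modality or acquisition site.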
Abstract
In the rapidly evolving field of medical imaging, machine learning algorithms have become indispensable for enhancing diagnostic accuracy. However, the effectiveness of these algorithms is contingent upon the availability and organization of high-quality medical imaging datasets. Traditional Digital Imaging and Communications in Medicine (DICOM) data management systems are inadequate for handling the scale and complexity of the data that machine learning algorithms require. This paper introduces an innovative data curation tool, developed as part of the Kaapana open-source toolkit, aimed at streamlining the organization, management, and processing of large-scale medical imaging datasets. The tool is specifically tailored to meet the needs of radiologists and machine learning researchers. It incorporates advanced search, auto-annotation and efficient tagging functionalities…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Medical Imaging Techniques and Applications · Advanced X-ray and CT Imaging
