Improving Radiography Machine Learning Workflows via Metadata Management for Training Data Selection

Mirabel Reid; Christine Sweeney; Oleg Korobkin

arXiv:2408.12655·cs.LG·August 26, 2024

Open Access

TL;DR

This paper presents a metadata management tool for radiography machine learning workflows that enhances data selection, reduces redundancy, and improves reproducibility in scientific research pipelines.

Contribution

The paper introduces a novel metadata management tool tailored for dynamic radiography, demonstrating its effectiveness and discussing potential extensions to broader scientific machine learning workflows.

Findings

1. Improved data selection efficiency in radiography ML workflows
2. Enhanced reproducibility of machine learning experiments
3. Potential for reducing redundant work in scientific research pipelines

Abstract

Most machine learning models require many iterations of hyper-parameter tuning, feature engineering, and debugging to produce effective results. As machine learning models become more complicated, this pipeline becomes more difficult to manage effectively. In the physical sciences, there is an ever-increasing pool of metadata that is generated by the scientific research cycle. Tracking this metadata can reduce redundant work, improve reproducibility, and aid in the feature and training dataset engineering process. In this case study, we present a tool for machine learning metadata management in dynamic radiography. We evaluate the efficacy of this tool against the initial research workflow and discuss extensions to general machine learning pipelines in the physical sciences.
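
To make the idea of metadata-driven training data selection concrete, here is a minimal Python sketch. It is not the tool described in the paper: the `RunRecord` fields, the material names, the noise threshold, and the helper functions `select_runs` and `selection_fingerprint` are all hypothetical, chosen only to illustrate how recorded metadata could drive a repeatable selection of training runs.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical metadata for one simulated radiography run."""
    run_id: str
    material: str
    n_frames: int
    noise_level: float


def select_runs(catalog, material, max_noise):
    """Return records matching the criteria, sorted by run_id for a stable order."""
    chosen = [
        r for r in catalog
        if r.material == material and r.noise_level <= max_noise
    ]
    return sorted(chosen, key=lambda r: r.run_id)


def selection_fingerprint(records, criteria):
    """Hash the criteria and the selected run IDs so a selection can be compared later."""
    payload = json.dumps(
        {"criteria": criteria, "run_ids": [r.run_id for r in records]},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Illustrative catalog; the values are made up for this example.
catalog = [
    RunRecord("run-003", "copper", 4, 0.02),
    RunRecord("run-001", "copper", 4, 0.10),
    RunRecord("run-002", "tantalum", 8, 0.01),
]

criteria = {"material": "copper", "max_noise": 0.05}
selected = select_runs(catalog, **criteria)
fingerprint = selection_fingerprint(selected, criteria)
print([r.run_id for r in selected], fingerprint[:12])
```

Storing the fingerprint alongside a trained model is one possible way to check later whether two experiments used the same training subset, which is the kind of reproducibility benefit the abstract attributes to metadata tracking.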

Taxonomy

Topics: Radiomics and Machine Learning in Medical Imaging · Digital Radiography and Breast Imaging · Medical Imaging Techniques and Applications