Parallel-beam X-ray CT datasets of apples with internal defects and label balancing for machine learning
Sophia Bethany Coban, Vladyslav Andriiashen, Poulami Somanya Ganguly, Maureen van Eijnatten, Kees Joost Batenburg

TL;DR
This paper introduces three simulated parallel-beam X-ray CT datasets of 94 apples with internal defects, built from real CT scans, for developing and testing machine learning methods for image reconstruction, segmentation, and defect detection. It also addresses the label bias that the uneven distribution of defects introduces.
Contribution
The paper provides parallel-beam CT datasets derived from real 3D X-ray CT scans, together with defect label files, and formulates label bias as an optimization problem so that the data can be split in a balanced way for machine learning applications.
Findings
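The three dataset versions are a noiseless simulation, a simulation with added Gaussian noise, and a simulation with scattering noise, all based on real 3D X-ray CT data.
Label bias is formulated as an optimization problem and solved with two methods: a simple heuristic algorithm and mixed integer quadratic programming.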
Datasets support various tasks like reconstruction, segmentation, and defect detection.
Abstract
We present three parallel-beam tomographic datasets of 94 apples with internal defects along with defect label files. The datasets are prepared for development and testing of data-driven, learning-based image reconstruction, segmentation and post-processing methods. The three versions are a noiseless simulation, a simulation with added Gaussian noise, and a simulation with scattering noise. The datasets are based on real 3D X-ray CT data and their subsequent volume reconstructions. The ground truth images, based on the volume reconstructions, are also available through this project. Apples contain various defects, which naturally introduce a label bias. We tackle this by formulating the bias as an optimization problem. In addition, we demonstrate solving this problem with two methods: a simple heuristic algorithm and mixed integer quadratic programming. This ensures the datasets can be split…
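To make the label-balancing idea concrete, the sketch below shows one possible swap-based heuristic for choosing a test subset whose share of each defect type is close to the subset's share of all apples. It is an illustration only and is not the authors' heuristic or their mixed integer quadratic programming formulation. The function names (`imbalance`, `balanced_split`), the squared-gap objective, and the toy per-apple defect counts are all assumptions made for this example.

```python
import random


def imbalance(counts, subset, target_fraction):
    """Sum of squared gaps between each defect's share in `subset`
    and the target share (hypothetical objective, not from the paper)."""
    n_labels = len(counts[0])
    score = 0.0
    for j in range(n_labels):
        total = sum(row[j] for row in counts)
        if total == 0:
            continue  # a defect that never occurs cannot be imbalanced
        in_subset = sum(counts[i][j] for i in subset)
        score += (in_subset / total - target_fraction) ** 2
    return score


def balanced_split(counts, test_size, seed=0, max_rounds=1000):
    """Start from a random test set, then repeatedly apply the first
    single-apple swap that lowers the imbalance score."""
    rng = random.Random(seed)
    n = len(counts)
    test = set(rng.sample(range(n), test_size))
    target = test_size / n
    best = imbalance(counts, test, target)
    for _ in range(max_rounds):
        improved = False
        for inside in sorted(test):
            for outside in range(n):
                if outside in test:
                    continue
                candidate = (test - {inside}) | {outside}
                score = imbalance(counts, candidate, target)
                if score < best - 1e-12:
                    test, best, improved = candidate, score, True
                    break
            if improved:
                break  # restart the scan from the updated test set
        if not improved:
            break  # local optimum: no single swap helps
    return sorted(test), best


# Toy data (hypothetical): rows are apples, columns are defect types,
# entries are how many slices of that apple show the defect.
toy_counts = [[4, 0], [0, 4], [2, 2], [1, 3], [3, 1], [0, 0], [5, 1], [1, 5]]
test_apples, score = balanced_split(toy_counts, test_size=2)
```

A local search of this kind can stop at a local optimum, which is one reason an exact formulation such as mixed integer quadratic programming is a natural alternative to a heuristic.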
Taxonomy
Topics: Spectroscopy and Chemometric Analyses · Smart Agriculture and AI · Metabolomics and Mass Spectrometry Studies
