COVIDx CT-3: A Large-scale, Multinational, Open-Source Benchmark Dataset for Computer-aided COVID-19 Screening from Chest CT Images
Hayden Gunraj, Tia Tuinstra, Alexander Wong

TL;DR
COVIDx CT-3 is, to the authors' knowledge, the largest and most diverse open-access dataset of chest CT images for COVID-19 detection. It is intended to support machine learning development, though the dataset shows significant geographic and class imbalances.
Contribution
This paper introduces COVIDx CT-3, a large-scale, multinational benchmark dataset for COVID-19 detection from chest CT images, addressing data limitations in prior studies.
Findings
COVIDx CT-3 contains 431,205 CT slices from 6,068 patients across at least 17 countries.
The dataset exhibits significant geographic and class imbalances.
To the best of the authors' knowledge, COVIDx CT-3 is the largest open-access COVID-19 CT dataset to date.
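As a rough sanity check on the scale figures above, the following minimal sketch computes the mean number of slices per patient from the two quoted totals. It also includes a small helper for quantifying class imbalance from a list of labels. The function names and the imbalance-ratio definition (largest class count divided by smallest) are illustrative assumptions, not part of the dataset's tooling, and no per-class counts from the paper are used here.

```python
from collections import Counter

# Totals quoted in the summary above.
TOTAL_SLICES = 431_205
TOTAL_PATIENTS = 6_068


def mean_slices_per_patient(slices: int, patients: int) -> float:
    """Average number of CT slices contributed by each patient."""
    if patients <= 0:
        raise ValueError("patients must be positive")
    return slices / patients


def imbalance_ratio(labels) -> float:
    """Hypothetical imbalance measure: largest class count / smallest class count.

    A value of 1.0 means perfectly balanced classes; larger values mean
    stronger imbalance.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must be non-empty")
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    avg = mean_slices_per_patient(TOTAL_SLICES, TOTAL_PATIENTS)
    print(f"Mean slices per patient: {avg:.2f}")
```

With the real label column of a dataset split loaded into a list, `imbalance_ratio(labels)` would give one simple number to track alongside per-country patient counts when assessing the reported imbalances.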
Abstract
Computed tomography (CT) has been widely explored as a COVID-19 screening and assessment tool to complement RT-PCR testing. To assist radiologists with CT-based COVID-19 screening, a number of computer-aided systems have been proposed. However, many proposed systems are built using CT data which is limited in both quantity and diversity. Motivated to support efforts in the development of machine learning-driven screening systems, we introduce COVIDx CT-3, a large-scale multinational benchmark dataset for detection of COVID-19 cases from chest CT images. COVIDx CT-3 includes 431,205 CT slices from 6,068 patients across at least 17 countries, which to the best of our knowledge represents the largest, most diverse dataset of COVID-19 CT images in open-access form. Additionally, we examine the data diversity and potential biases of the COVIDx CT-3 dataset, finding that significant…
Taxonomy
Topics: COVID-19 diagnosis using AI · Radiomics and Machine Learning in Medical Imaging · Machine Learning in Healthcare
