COVID-19 Image Data Collection: Prospective Predictions Are the Future
Joseph Paul Cohen, Paul Morrison, Lan Dao, Karsten Roth, Tim Q Duong, Marzyeh Ghassemi

TL;DR
This paper introduces the first large public COVID-19 chest X-ray dataset, enabling development of predictive tools for patient prognosis and disease progression, with initial use cases demonstrated.
Contribution
It provides the largest publicly available COVID-19 image dataset with detailed metadata, facilitating machine learning research for diagnosis and prognosis.
Findings
Dataset contains hundreds of frontal and lateral X-rays with metadata
Potential use cases include predicting ICU need and patient survival
Dataset supports trajectory analysis during treatment
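As a rough illustration of the trajectory use case listed above, the sketch below groups image records by patient and orders them by time. The column names (`patient_id`, `offset_days`, `view`, `went_icu`) and the sample rows are hypothetical, chosen for illustration only; they are not taken from the dataset's actual schema or from its dataloader code.

```python
import csv
import io
from collections import defaultdict

# Hypothetical metadata rows; the column names and values are assumptions
# made for this sketch, not the dataset's real schema.
SAMPLE_CSV = """patient_id,offset_days,view,went_icu
p1,3,PA,N
p1,0,PA,N
p2,1,L,Y
p1,5,AP,Y
p2,0,PA,N
"""

# Frontal projections kept for the trajectory; lateral ("L") rows are skipped.
FRONTAL_VIEWS = {"PA", "AP"}


def build_trajectories(csv_text):
    """Group frontal-view records by patient, ordered by day offset."""
    trajectories = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["view"] not in FRONTAL_VIEWS:
            continue
        record = (int(row["offset_days"]), row["went_icu"])
        trajectories[row["patient_id"]].append(record)
    # Sorting the (offset, label) tuples orders each patient's images in time.
    return {pid: sorted(records) for pid, records in trajectories.items()}


trajectories = build_trajectories(SAMPLE_CSV)
for patient, records in sorted(trajectories.items()):
    print(patient, records)
```

Each patient's list can then be read as a time series, for example to see at which offset an outcome label such as ICU admission first changes.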
Abstract
Across the world's coronavirus disease 2019 (COVID-19) hot spots, the need to streamline patient diagnosis and management has become more pressing than ever. As one of the main imaging tools, chest X-rays (CXRs) are common, fast, non-invasive, relatively cheap, and can potentially be used at the bedside to monitor the progression of the disease. This paper describes the first public COVID-19 image data collection as well as a preliminary exploration of possible use cases for the data. This dataset currently contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it a necessary resource to develop and evaluate tools to aid in the treatment of COVID-19. It was manually aggregated from publication figures as well as various web-based repositories into a machine learning (ML) friendly format with accompanying dataloader code. We collected…
Taxonomy
Topics: COVID-19 diagnosis using AI · Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging
