PediCXR: An open, large-scale chest radiograph dataset for interpretation of common thoracic diseases in children
Hieu H. Pham, Ngoc H. Nguyen, Thanh T. Tran, Tuan N.M. Nguyen, and Ha Q. Nguyen

TL;DR
PediCXR is the first large-scale pediatric chest X-ray dataset with detailed annotations, designed to facilitate the development of diagnostic models for thoracic diseases in children.
Contribution
This paper introduces PediCXR, a comprehensive, annotated pediatric CXR dataset, filling a critical gap for data-driven pediatric disease diagnosis research.
Findings
Largest pediatric CXR dataset with lesion-level annotations
Includes 36 findings and 15 diseases with bounding boxes
Publicly available for research and development
Abstract
The development of diagnostic models for detecting and diagnosing pediatric diseases in CXR scans remains underexplored due to the lack of high-quality physician-annotated datasets. To overcome this challenge, we introduce and release PediCXR, a new pediatric CXR dataset of 9,125 studies retrospectively collected from a major pediatric hospital in Vietnam between 2020 and 2021. Each scan was manually annotated by a pediatric radiologist with more than ten years of experience. The dataset was labeled for the presence of 36 critical findings and 15 diseases. In particular, each abnormal finding was localized with a rectangular bounding box on the image. To the best of our knowledge, this is the first and largest pediatric CXR dataset containing lesion-level annotations and image-level labels for the detection of multiple findings and diseases. For algorithm development, the dataset was divided into…
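The abstract describes two levels of labeling: lesion-level findings, each localized by a rectangular bounding box, and image-level disease labels. As a rough illustration only, one study could be held in memory as sketched below. The class names, field names, pixel-coordinate convention, and label strings are all assumptions made for this example and are not taken from the dataset's released files.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned rectangle in pixel coordinates (assumed convention)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp at zero so a degenerate box never yields a negative area.
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


@dataclass
class Finding:
    """One lesion-level annotation: a finding label plus its box."""
    label: str
    box: BoundingBox


@dataclass
class StudyRecord:
    """One CXR study with lesion-level and image-level labels."""
    study_id: str  # hypothetical identifier
    findings: List[Finding] = field(default_factory=list)
    diseases: List[str] = field(default_factory=list)

    def is_abnormal(self) -> bool:
        # A study counts as abnormal here if it has at least one boxed finding.
        return len(self.findings) > 0


# Illustrative values only; not real annotations from PediCXR.
record = StudyRecord(
    study_id="example-0001",
    findings=[Finding("Consolidation", BoundingBox(100, 50, 300, 250))],
    diseases=["Pneumonia"],
)
normal = StudyRecord(study_id="example-0002")
```

Keeping findings and diseases in separate lists mirrors the distinction the abstract draws between lesion-level annotations and image-level labels, so a detection model can consume the boxes while a classifier consumes the disease list.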
Taxonomy
Topics: COVID-19 diagnosis using AI · Lung Cancer Diagnosis and Treatment · Radiomics and Machine Learning in Medical Imaging
