# Radiomics dataset from chest CT of clinically healthy adults

**Authors:** Viktoria Bedei, Mykola Ostrovskyy, Nilanjan Dey, Taras Kotyk, R. Simon Sherratt

PMC · DOI: 10.1016/j.dib.2026.112556 · 2026-02-09

## TL;DR

This paper introduces a dataset of lung radiomic features from CT scans of 100 healthy adults, which can be used as a reference for lung disease studies.

## Contribution

The dataset provides standardized radiomic features from healthy lungs, enabling benchmarking and comparative modeling in lung diseases.

## Key findings

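- The dataset includes 107 radiomic features per ROI, extracted for 8 ROIs per subject (five lobes, left lung, right lung, and both lungs) with two mask types (“raw” and parenchyma) and two attenuation ranges; all scans were acquired on a single scanner with a uniform CT protocol.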
- It serves as a normative reference for lung radiomics and can be used for harmonization and robustness studies.
- The dataset supports comparative modeling in diseases like emphysema and COPD.

## Abstract

This data note describes a structured dataset of lung radiomic features derived from thoracic noncontrast computed tomography examinations of 100 subjects (47 males, 53 females; aged 15–74 years). Participants were selected on the basis of the absence of known lung, pleura, and mediastinum diseases in clinical records and radiology reports, as well as systemic diseases affecting the respiratory system. The included computed tomography studies were performed on a single multidetector CT scanner (Siemens Healthineers SOMATOM go.Now), using a uniform protocol (110 kVp; reconstructed slice thickness 0.8 mm; Br60-type lung kernel).

For each case, the target thin-slice DICOM series was converted to the NIfTI format. The lung lobes (“raw” masks), vessels and air pathways were segmented automatically with TotalSegmentator. In addition to “raw” lobe masks, vessel/airway-subtracted (parenchyma) masks were generated. Lobe masks (left lung – 2, right lung – 3) were also combined into the left lung, right lung, and both lungs, resulting in eight ROIs per subject for each mask type – with (“raw”) and without vessel/air pathways (“parenchyma”).
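The ROI construction described above can be illustrated with a minimal NumPy sketch. It is not the authors' script: the lobe key names, the `union` and `build_rois` helpers, and the toy volume are all assumptions made for illustration, and the real masks would come from the TotalSegmentator output rather than hand-built arrays.

```python
import numpy as np

# Hypothetical lobe keys; the dataset's actual ROI labels may differ.
LEFT_LOBES = ("left_upper", "left_lower")
RIGHT_LOBES = ("right_upper", "right_middle", "right_lower")


def union(masks):
    """Voxel-wise logical OR of a sequence of masks."""
    out = np.zeros_like(masks[0], dtype=bool)
    for m in masks:
        out |= m.astype(bool)
    return out


def build_rois(lobes, vessels, airways):
    """Return the 8 "raw" ROIs and their vessel/airway-subtracted versions."""
    raw = {name: lobes[name].astype(bool) for name in LEFT_LOBES + RIGHT_LOBES}
    raw["left_lung"] = union([raw[n] for n in LEFT_LOBES])
    raw["right_lung"] = union([raw[n] for n in RIGHT_LOBES])
    raw["both_lungs"] = raw["left_lung"] | raw["right_lung"]

    # Parenchyma mask = raw mask with vessel and airway voxels removed.
    exclude = vessels.astype(bool) | airways.astype(bool)
    parenchyma = {name: mask & ~exclude for name, mask in raw.items()}
    return {"raw": raw, "parenchyma": parenchyma}


# Toy 1 x 1 x 10 volume (synthetic): two voxels per lobe.
shape = (1, 1, 10)
toy_lobes = {}
for i, name in enumerate(LEFT_LOBES + RIGHT_LOBES):
    m = np.zeros(shape, dtype=bool)
    m[0, 0, 2 * i : 2 * i + 2] = True
    toy_lobes[name] = m

toy_vessels = np.zeros(shape, dtype=bool)
toy_vessels[0, 0, 0] = True  # one vessel voxel inside the first lobe
toy_airways = np.zeros(shape, dtype=bool)
toy_airways[0, 0, 9] = True  # one airway voxel inside the last lobe

rois = build_rois(toy_lobes, toy_vessels, toy_airways)
print({name: int(mask.sum()) for name, mask in rois["parenchyma"].items()})
```

Each mask type ends up with the same eight ROI names, which matches the "eight ROIs per subject for each mask type" layout of the abstract.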

For each SubjectID×ROI, radiomic features (107 “original” features – shape, first-order, and texture families) were extracted via a PyRadiomics-based pipeline with fixed settings (B-spline interpolation; resampling to 1 × 1 × 1 mm; bin width 25 HU; absolute resegmentation) in two attenuation ranges: −1000 to +200 HU and −950 to 0 HU. The dataset is distributed as (i) a CT protocol table, (ii-iii) two feature tables (“raw” and parenchyma masks), (iv) a JSON file with a computational environment description, (v) a Python extraction script, and (vi) a dictionary file.
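As a rough illustration of the fixed settings listed above, the sketch below builds one PyRadiomics-style settings dictionary per attenuation range. The setting keys follow PyRadiomics naming, but the `HU_RANGES` labels and the `make_settings` helper are assumptions; the extraction script distributed with the dataset remains the authoritative configuration and may set further options.

```python
# Attenuation ranges named in the abstract (HU); the "wide"/"narrow" labels are illustrative.
HU_RANGES = {"wide": (-1000, 200), "narrow": (-950, 0)}


def make_settings(hu_range):
    """Build a PyRadiomics-style settings dict for one attenuation range."""
    low, high = hu_range
    return {
        "interpolator": "sitkBSpline",       # B-spline interpolation
        "resampledPixelSpacing": [1, 1, 1],  # resample to 1 x 1 x 1 mm
        "binWidth": 25,                      # fixed bin width of 25 HU
        "resegmentRange": [low, high],       # keep only voxels inside the HU range
        "resegmentMode": "absolute",         # range given in absolute HU values
    }


settings_by_range = {name: make_settings(r) for name, r in HU_RANGES.items()}
for name, settings in settings_by_range.items():
    print(name, settings["resegmentRange"])
```

With PyRadiomics installed, a dictionary like this could be passed as `featureextractor.RadiomicsFeatureExtractor(**settings)` and applied with `extractor.execute(image_path, mask_path)` once per SubjectID×ROI; the two paths here are placeholders.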

This dataset can serve as a normative reference for lung radiomics, a benchmark for harmonization and robustness studies, and a control cohort for comparative modelling in diffuse lung diseases, as well as region-specific diseases requiring lobar or single-lung-specific radiomic features (such as emphysema, chronic obstructive pulmonary disease, Swyer–James–MacLeod syndrome, asbestosis, and silicosis).
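One simple way to use such a cohort as a normative reference is to express a new measurement as a z-score against the healthy distribution of the same feature and ROI. The sketch below is an assumption about usage, not a method from the data note; the `z_score` helper is hypothetical and the numbers are synthetic, not taken from the dataset.

```python
from statistics import mean, stdev


def z_score(value, reference_values):
    """Standardize one feature value against a healthy reference sample."""
    mu = mean(reference_values)
    sigma = stdev(reference_values)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference values have zero variance")
    return (value - mu) / sigma


# Synthetic values for illustration only (e.g. a mean-attenuation-like feature in HU).
reference = [-860.0, -850.0, -840.0, -830.0, -820.0]
z = z_score(-900.0, reference)
print(round(z, 2))
```

In practice the reference values would be read from the feature table for a matching ROI, mask type, and attenuation range, and non-Gaussian features may call for percentiles rather than z-scores.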


## Linked entities

- **Diseases:** emphysema (MONDO:0004849), chronic obstructive pulmonary disease (MONDO:0005002), Swyer–James–MacLeod syndrome (MONDO:0800120), asbestosis (MONDO:0016466), silicosis (MONDO:0005960)

## Full-text entities

- **Diseases:** diseases (MESH:D004194), nodules (MESH:D016606), cancer (MESH:D009369), lung diseases (MESH:D008171), thoracic abnormalities (MESH:D013896), abnormalities in the lung parenchyma (MESH:D010195), emphysema (MESH:D004646), mediastinum diseases (MESH:D008479), COPD (MESH:D029424), Swyer-James-MacLeod syndrome (MESH:D019568), COVID-19 (MESH:D000086382), sarcoidosis (MESH:D012507), acquired pneumonia (MESH:D000077299), embolism (MESH:D004617), asbestosis (MESH:D001195), interstitial lung disease (MESH:D017563), tuberculosis (MESH:D014376), oncology (MESH:D000072716), community (MESH:D003147), silicosis (MESH:D012829)
- **Chemicals:** dcm2niix (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12925589/full.md

---
Source: https://tomesphere.com/paper/PMC12925589