# Curated and harmonised transcriptomics datasets of interstitial lung diseases

**Authors:** Simo Inkala, Antonio Federico, Angela Serra, Dario Greco

PMC · DOI: 10.1016/j.dib.2025.112139 · 2025-10-14

## TL;DR

This study assembles a curated, standardized compendium of 30 interstitial lung disease transcriptomics datasets (1371 samples) to support systems biology research and the development of diagnostic and therapeutic strategies.

## Contribution

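The novel contribution is a curated, harmonized transcriptomics compendium for interstitial lung diseases, with standardized metadata, lists of differentially expressed genes between ILD and healthy samples, and co-expression networks for IPF and healthy samples.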

## Key findings

- A compendium of 30 transcriptomics datasets (1371 samples) was curated and harmonized for interstitial lung diseases.
- Differentially expressed genes between ILD and healthy samples were identified and provided.
- Co-expression networks for IPF and healthy samples were inferred and included in the dataset.

## Abstract

This study provides manually curated and homogenised transcriptomics data of interstitial lung disease (ILD) patients retrieved from the NCBI Gene Expression Omnibus and European Nucleotide Archive repositories. The compendium includes 30 transcriptomics datasets generated with DNA microarrays and RNA sequencing (RNA-seq) technologies for a total of 1371 samples. All the datasets underwent metadata curation and harmonisation, data quality check, and preprocessing with standardised procedures. Furthermore, a robust data model was developed to standardise phenotypic data, thereby enhancing comparability across heterogeneous datasets. Gene expression data and lists of differentially expressed genes computed between ILD and healthy samples are provided. Among the ILDs included in this study, idiopathic pulmonary fibrosis (IPF) is the most represented worldwide. Co-expression networks of IPF and healthy samples were inferred and are also included in this study. This study enhances the Findability, Accessibility, Interoperability, and Reusability (FAIR) of publicly available transcriptomic datasets related to ILDs. The resulting resource provides an integrated platform for the implementation and validation of systems biology and pharmacology approaches, facilitating the development of novel diagnostic and therapeutic strategies for ILDs.
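To make the idea of a data model for phenotypic metadata more concrete, the following is a minimal sketch of mapping heterogeneous sample annotations onto a shared vocabulary. It is not the authors' data model: the field aliases, vocabularies, and function names (`FIELD_ALIASES`, `DISEASE_VOCAB`, `SEX_VOCAB`, `harmonise_record`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical aliases: raw metadata field name (lowercased) -> harmonised field.
FIELD_ALIASES: Dict[str, str] = {
    "disease state": "disease",
    "diagnosis": "disease",
    "disease": "disease",
    "gender": "sex",
    "sex": "sex",
}

# Hypothetical controlled vocabularies: raw label (lowercased) -> harmonised term.
DISEASE_VOCAB: Dict[str, str] = {
    "ipf": "idiopathic pulmonary fibrosis",
    "idiopathic pulmonary fibrosis": "idiopathic pulmonary fibrosis",
    "control": "healthy",
    "normal": "healthy",
    "healthy": "healthy",
}
SEX_VOCAB: Dict[str, str] = {
    "m": "male",
    "male": "male",
    "f": "female",
    "female": "female",
}

VOCABS: Dict[str, Dict[str, str]] = {"disease": DISEASE_VOCAB, "sex": SEX_VOCAB}


def harmonise_record(raw: Dict[str, str]) -> Dict[str, str]:
    """Map one sample's raw annotations onto the harmonised fields and terms."""
    out: Dict[str, str] = {}
    for key, value in raw.items():
        field = FIELD_ALIASES.get(key.strip().lower())
        if field is None:
            continue  # field is outside this illustrative model, so it is dropped
        label = value.strip().lower()
        # Labels missing from the vocabulary are flagged rather than guessed.
        out[field] = VOCABS[field].get(label, "unknown")
    return out


def harmonise(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Harmonise a list of sample records."""
    return [harmonise_record(r) for r in records]
```

Marking unmapped labels as `"unknown"` instead of passing them through keeps the harmonised table restricted to known terms, which is one way to surface samples that still need manual curation.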

## Linked entities

- **Diseases:** interstitial lung disease (MONDO:0015925), idiopathic pulmonary fibrosis (MONDO:0800029)

## Full-text entities

- **Diseases:** IPF (MESH:D054990), ILD (MESH:D017563)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12581653/full.md

---
Source: https://tomesphere.com/paper/PMC12581653