# TILDA-X: Transcriptome-Informed Lung Cancer Disparities via Explainable AI

**Authors:** Masrur Sobhan, Md Mezbahul Islam, Mary Jo Trepka, Gregory E. Holt, Charles J. Dimitroff, Ananda M. Mondal

PMC · DOI: 10.3390/cancers17213454 · Cancers · 2025-10-28

## TL;DR

This paper introduces TILDA-X, an explainable AI framework for studying lung cancer disparities. It classifies samples by disease condition rather than by race, which sidesteps the racial imbalance in the dataset and yields more accurate and biologically valid results than race-based classification.

## Contribution

TILDA-X is a novel explainable AI framework that uses disease conditions for classification to reduce bias and identify disparity biomarkers in lung cancer.

## Key findings

- Classification based on disease conditions achieved 88-100% accuracy for minority groups (African American males and females), compared to 0-16% for race-based classification.
- More than 63% of the significant pathways identified overlapped with pathways reported in earlier lung cancer studies, supporting biological validity.
- The framework reveals unique biological pathways linked to disparities in different racial and sex groups.
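
The accuracy comparison above depends on scoring a classifier separately for each racial and sex subgroup rather than over the whole dataset. The following is a minimal sketch of that evaluation step, not the authors' code: the function name `subgroup_accuracy`, the subgroup labels such as `AA_male`, and the toy predictions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def subgroup_accuracy(records):
    """Return accuracy per subgroup.

    `records` is an iterable of (subgroup, true_label, predicted_label)
    tuples. The tuple layout is an assumption made for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subgroup, y_true, y_pred in records:
        total[subgroup] += 1
        correct[subgroup] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Toy, made-up predictions over the three disease conditions in the paper
# (LUAD, LUSC, healthy). These are not results from the study.
records = [
    ("AA_male", "LUAD", "LUAD"),
    ("AA_male", "LUSC", "LUSC"),
    ("AA_female", "healthy", "healthy"),
    ("AA_female", "LUAD", "LUSC"),
    ("EA_male", "LUSC", "LUSC"),
    ("EA_female", "LUAD", "LUAD"),
]

per_group = subgroup_accuracy(records)
for group, acc in sorted(per_group.items()):
    print(f"{group}: {acc:.2f}")
```

Reporting accuracy per subgroup makes a weakness visible that an overall accuracy figure can hide: a model can score well overall while failing on the smallest groups.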

## Abstract

This research addresses the challenge of unequal outcomes in lung cancer across racial and sex groups. Traditional approaches that classify patients by race often produce biased results due to imbalanced data. To overcome this, the authors created a new framework called TILDA-X, which uses disease conditions instead of race for classification and employs explainable artificial intelligence to identify meaningful biomarkers. By discovering individual patient biomarkers first and then building up to group-level patterns, the method uncovers unique biological pathways linked to disparities between different racial and sex groups. The results show that this approach is far more accurate and biologically valid than race-based classification. These findings provide a robust and more reliable way to understand lung cancer differences, supporting the development of precision medicine that can benefit diverse patient populations.

**Background:** Lung cancer is a leading cause of cancer-related mortality, with disparities in incidence and outcomes observed across racial and sex groups. Identifying both patient-specific and cohort-specific disparity biomarkers is critical for developing targeted treatments. The lung cancer dataset is highly imbalanced across races, so classifying by race produces biased disparity information.

**Method:** This study developed an explainable artificial intelligence-based framework, TILDA-X, which builds classification models on disease conditions instead of races to mitigate racial imbalance in the dataset, and applies explainable AI to delineate patient-specific disparity information. A lung cancer transcriptome dataset with three disease conditions (lung adenocarcinoma, lung squamous cell carcinoma, and healthy samples) was used to develop the classification models. Using a bottom-up approach, cohort-specific disparity information is derived from the patient-specific information for four racial and sex groups: African American males, European American males, African American females, and European American females.

**Results:** Classification based on disease conditions achieved accuracy between 88% and 100% for minority groups (African American males and females), whereas race-based classification achieved only between 0% and 16%, which underscores the significance of the proposed approach. Functional analysis of sub-cohort-specific biomarker genes revealed unique pathways associated with lung cancers in different races and sexes. Among the significant pathways identified, more than 63% overlapped with previously reported lung cancer-related studies, supporting the biological validity of the findings.

Overall, by combining disease-condition-based classification with explainable AI, this study provides a robust, interpretable framework for characterizing race- and sex-specific disparities in lung cancer, offering a foundation for precision oncology and equitable therapeutic development based on transcriptome profiles alone.
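
The bottom-up step described in the abstract, going from patient-specific to cohort-specific information, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not name the attribution method or the aggregation rule, so the sketch assumes that each patient already has a per-gene attribution score from some explainable AI method, and it uses a simple frequency threshold to aggregate. The function names, the parameters `k` and `min_fraction`, the gene names, and the scores are all hypothetical.

```python
from collections import Counter, defaultdict


def top_k_genes(scores, k):
    """Patient level: keep the k genes with the largest absolute attribution."""
    ranked = sorted(scores, key=lambda gene: (-abs(scores[gene]), gene))
    return set(ranked[:k])


def cohort_biomarkers(patients, k, min_fraction):
    """Cohort level: keep genes that appear in the top-k list of at least
    `min_fraction` of the patients in that cohort (an assumed rule)."""
    counts = defaultdict(Counter)
    sizes = Counter()
    for cohort, scores in patients:
        sizes[cohort] += 1
        counts[cohort].update(top_k_genes(scores, k))
    return {
        cohort: {g for g, n in counts[cohort].items() if n / sizes[cohort] >= min_fraction}
        for cohort in sizes
    }


def unique_to_cohort(biomarkers):
    """Keep only the genes that no other cohort shares."""
    unique = {}
    for cohort, genes in biomarkers.items():
        others = set().union(*(g for c, g in biomarkers.items() if c != cohort))
        unique[cohort] = genes - others
    return unique


# Toy, made-up attribution scores: (cohort, {gene: score}) per patient.
patients = [
    ("AA_male", {"GENE_A": 0.9, "GENE_B": -0.4, "GENE_C": 0.1}),
    ("AA_male", {"GENE_A": 0.7, "GENE_B": 0.2, "GENE_C": -0.5}),
    ("EA_male", {"GENE_A": 0.1, "GENE_B": 0.8, "GENE_C": 0.6}),
    ("EA_male", {"GENE_A": 0.05, "GENE_B": -0.9, "GENE_C": 0.3}),
]

biomarkers = cohort_biomarkers(patients, k=2, min_fraction=1.0)
unique = unique_to_cohort(biomarkers)
print(biomarkers)
print(unique)
```

Because the cohort-level sets are built from patient-level lists, the race and sex labels are needed only when grouping patients after the model has been trained, which is consistent with the paper's point that the classifier itself is trained on disease conditions.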

## Linked entities

- **Diseases:** lung cancer (MONDO:0005138), lung adenocarcinoma (MONDO:0005061), lung squamous cell carcinoma (MONDO:0005097)

## Full-text entities

- **Diseases:** lung squamous cell carcinoma (MESH:D002294), Lung Cancer (MESH:D008175), cancer (MESH:D009369), lung adenocarcinoma (MESH:D000077192)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12607860/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12607860/full.md

## References

83 references — full list in the complete paper: https://tomesphere.com/paper/PMC12607860/full.md

---
Source: https://tomesphere.com/paper/PMC12607860