# Racial Disparities in Comorbidity Patterns of Early-Onset Liver Cancer: A Machine Learning Analysis

**Authors:** Bingya Ma, Kai Zheng, Fa-Chyi Lee, Yunxia Lu

PMC · DOI: 10.1177/10732748251363687 · Cancer Control : Journal of the Moffitt Cancer Center · 2025-07-30

## TL;DR

This case-control study uses race/ethnicity-specific machine learning models to show that the comorbidities associated with early-onset liver cancer differ across racial and ethnic groups, a finding that could help target prevention efforts.

## Contribution

The study develops race/ethnicity-specific machine learning models to identify comorbidity patterns in early-onset liver cancer and compares those patterns across Asian and Pacific Islander, Hispanic, and White patients.

## Key findings

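- Asian and Pacific Islanders (API) had significantly higher rates of Hepatitis B virus (HBV) infection, while Hispanics had higher prevalences of cirrhosis, hypertension, diabetes, and Hepatitis C virus (HCV) infection.
- The race/ethnicity-specific models for API (F1 = 0.77, AUC = 0.90) and Hispanics (F1 = 0.77, AUC = 0.92) outperformed the model for Whites (F1 = 0.64, AUC = 0.87) on the validation dataset.
- SHAP analysis showed that HBV infection was the dominant comorbidity for API and that HCV and metabolic disorders were notable among Hispanics, whereas Whites showed a broader, less concentrated comorbidity pattern.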

## Abstract

The incidence of early-onset liver cancer (EOLC) has been increasing in many countries, yet evidence on its etiology remains limited, particularly outside the Asian population. This case-control study explores the comorbidity patterns of EOLC and develops race/ethnicity-specific machine learning (ML) models to predict liver cancer risk.

We included patients diagnosed with primary liver cancer between ages 18 and 49 from the University of California Health Data Warehouse, matching each patient with five controls. ML classification methods, including decision trees, random forests, logistic regression, XGBoost, and LightGBM, were used to assess liver cancer risk based on demographics and comorbidities. Model performance was evaluated using F1 scores, and SHapley Additive exPlanations (SHAP) was applied to identify the most influential comorbidities within each racial group.
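As a minimal sketch of the evaluation step described above, the following computes an F1 score and a rank-based AUC in plain Python. The labels and scores are made-up toy values arranged in the study's 1:5 case-to-control ratio; the function names, the 0.5 decision threshold, and the data are illustrative assumptions, not the authors' pipeline.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (1 = case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 2 cases and 10 controls (hypothetical values, 1:5 ratio).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1, 0.2, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

toy_f1 = f1_score(y_true, y_pred)
toy_auc = roc_auc(y_true, scores)

# Reference point: labelling everyone a case under 1:5 matching gives
# recall 1 and precision 1/6, so F1 = 2/7.
baseline_f1 = f1_score([1] + [0] * 5, [1] * 6)
```

The toy example also shows why the two metrics can diverge: AUC depends only on how cases rank against controls, while F1 depends on the chosen threshold and on the case fraction. The 2/7 reference value holds only if the evaluation set keeps the 1:5 ratio, which is an assumption here.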

A total of 1574 patients and 7870 controls were identified. Asian and Pacific Islanders (API) had significantly higher rates of Hepatitis B virus (HBV) infection, while Hispanics had higher prevalences of cirrhosis, hypertension, diabetes, and Hepatitis C virus (HCV) infection. Whites showed higher rates of anxiety, asthma, hypothyroidism, and cholangitis. Race/ethnicity-specific models for API (F1 score = 0.77, AUC = 0.90) and Hispanics (F1 score = 0.77, AUC = 0.92) outperformed the model for Whites (F1 score = 0.64, AUC = 0.87) in the validation dataset. The SHAP results indicated that HBV infection was the dominant comorbidity for API, and HCV and metabolic disorders were notable among Hispanics. In contrast, the White population showed a broader and less concentrated comorbidity pattern.
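To illustrate what a SHAP attribution means, the sketch below computes exact Shapley values by enumerating feature subsets for a toy three-comorbidity risk score. The weights, the interaction term, and the names `toy_risk` and `shapley_values` are invented for illustration and are not taken from the paper; SHAP implementations for tree models typically use faster algorithms than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List

# Hypothetical per-comorbidity contributions to a risk score (not from the paper).
WEIGHTS = {"hbv": 0.30, "cirrhosis": 0.20, "diabetes": 0.05}


def toy_risk(present: FrozenSet[str]) -> float:
    """Toy model: additive weights plus an assumed HBV x cirrhosis interaction."""
    score = sum(WEIGHTS[f] for f in present)
    if {"hbv", "cirrhosis"} <= present:
        score += 0.10
    return score


def shapley_values(
    features: List[str], value: Callable[[FrozenSet[str]], float]
) -> Dict[str, float]:
    """Exact Shapley values: weighted average marginal contribution over all subsets."""
    n = len(features)
    phi: Dict[str, float] = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


phi = shapley_values(list(WEIGHTS), toy_risk)
# The interaction term is split evenly between HBV and cirrhosis, and the
# attributions sum to the difference between the full and empty feature sets.
```

In this toy setting HBV receives the largest attribution, which is the kind of pattern the abstract calls a "dominant" comorbidity; a broader pattern would correspond to attributions spread more evenly across many features.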

Our study highlights significant racial disparities in comorbidity patterns for early-onset liver cancer, demonstrating the potential of ML models to identify high-risk populations and inform targeted prevention strategies.

## Linked entities

- **Diseases:** Hepatitis B virus infection (MONDO:0005344), cirrhosis (MONDO:0005155), diabetes (MONDO:0005015), Hepatitis C virus infection (MONDO:0005231), anxiety (MONDO:0005618), asthma (MONDO:0004979), hypothyroidism (MONDO:0005420), cholangitis (MONDO:0004789)

## Full-text entities

- **Genes:** ADIPOQ (adiponectin, C1Q and collagen domain containing) [NCBI Gene 9370] {aka ACDC, ACRP30, ADIPQTL1, ADPN, APM-1, APM1}
- **Diseases:** chronic obstructive pulmonary disease (MESH:D029424), cirrhosis (MESH:D005355), peptic ulcer (MESH:D010437), infection (MESH:D007239), metabolic disease (MESH:D008659), Crohn's disease (MESH:D003424), Cancer (MESH:D009369), inflammatory (MESH:D007249), chronic hepatitis B (MESH:D019694), hepatitis virus infections (MESH:D006525), heart disease (MESH:D006331), liver and intrahepatic bile duct cancer (MESH:D001650), peripheral vascular disease (MESH:D016491), insulin resistance (MESH:D007333), respiratory or allergic diseases (MESH:D012130), steatosis of the liver (MESH:D005234), metabolic syndrome (MESH:D024821), metastasis (MESH:D009362), anemia (MESH:D000740), depressive disorder (MESH:D003866), autoimmune liver disease (MESH:D008107), liver cirrhosis (MESH:D008103), cataract (MESH:D002386), anxiety (MESH:D001007), cerebrovascular disease (MESH:D002561), hyperlipidemia (MESH:D006949), cardiovascular disease (MESH:D002318), colon polyps (MESH:D003111), obstructive sleep apnea (MESH:D020181), gout (MESH:D006073), inflammatory bowel diseases (MESH:D015212), osteoarthritis (MESH:D010003), congenital heart disease (MESH:D006330), mental health disorders (OMIM:603663), EOLC (MESH:D006528), alcohol (MESH:D000437), Vitamin D deficiency (MESH:D014808), polyp of the large intestine (MESH:D007417), chronic kidney disease (MESH:D051436), substance use disorders (MESH:D019966), allergic rhinitis (MESH:D065631), kidney stone (MESH:D007669), ICC (MESH:D018281), HIV infection (MESH:D015658), hypothyroidism (MESH:D007037), GERD (MESH:D005764), prostatic hyperplasia (MESH:D011470), diabetes (MESH:D003920), thyroid disease (MESH:D013959), HBV infection (MESH:D006509), PSC (MESH:D015209), chronic (MESH:D002908), renal conditions (MESH:D007674), HCV (MESH:D006526), osteoporosis (MESH:D010024), biliary diseases (MESH:D001660), ulcerative colitis (MESH:D003093), mental disorders (MESH:D001523), asthma (MESH:D001249), gastrointestinal diseases (MESH:D005767)
- **Chemicals:** Vitamin D (MESH:D014807), alcohol (MESH:D000438), lipid (MESH:D008055)
- **Species:** Hepatitis C Virus [taxon 11103], Homo sapiens (human, species) [taxon 9606], Hepatitis B virus (no rank) [taxon 10407]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12317173/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12317173/full.md

## References

77 references — full list in the complete paper: https://tomesphere.com/paper/PMC12317173/full.md

---
Source: https://tomesphere.com/paper/PMC12317173