Heterogeneous Colorectal Cancer Risk in Women with Metabolic Dysfunction-Associated Steatotic Liver Disease by Age, Lipid, and Waist-Circumference: A Nationwide Cohort Study
Chang Ik Yoon, Hye Sun Lee, Soyoung Jeon, Jin Ah Lee, Dooreh Kim, Jong Min Lee

TL;DR
This study shows that liver disease linked to metabolic issues increases colorectal cancer risk in Korean women, especially younger and non-obese individuals.
Contribution
The study identifies MASLD as a risk marker for colorectal cancer in metabolically healthy-appearing women.
Findings
MASLD increases colorectal cancer risk in Korean women, particularly those aged 40–49 years.
Women with MASLD and waist < 85 cm or without dyslipidemia show a higher risk of colorectal cancer.
MASLD is associated with a 10% increased risk of colorectal cancer after adjusting for multiple factors.
Abstract
Using a nationwide screening cohort of 483,401 Korean women, we found that metabolic dysfunction-associated steatotic liver disease (MASLD) is associated with an increased risk of incident colorectal cancer. Excess risk was pronounced in women aged 40–49 years, those without dyslipidemia, and those with waist < 85 cm. These findings suggest that MASLD could serve as an important marker for risk stratification, even in individuals who may appear metabolically healthy by conventional standards, thereby helping to identify women who might benefit from closer clinical attention and metabolic management. Background: Metabolic dysfunction-associated steatotic liver disease (MASLD) is increasingly common and linked to obesity; however, its association with colorectal cancer (CRC) risk in women remains unclear. Materials and Methods: This retrospective cohort study used the Korean National…
Figure 1
Figure 2
Figure 3
Department of Surgery at Yonsei University College of Medicine
1. Introduction
Metabolic dysfunction-associated steatotic liver disease (MASLD), the recently adopted term for non-alcoholic fatty liver disease (NAFLD), has emerged as a major public health issue due to its strong association with metabolic disorders such as type 2 diabetes mellitus and dyslipidemia [1]. This significant burden is evidenced by a high global prevalence ranging from 13.4% to 30.4% across various regions, while recent nationwide data from South Korea estimate the prevalence at approximately 27.5% [2,3].
Colorectal cancer (CRC) is the third most frequently diagnosed cancer and the second leading cause of cancer mortality worldwide [4], and it shares several metabolic risk factors with MASLD. Recent large-scale cohort studies and meta-analyses have consistently reported an increased risk of CRC in patients with NAFLD or MASLD [5,6,7]. A Korean nationwide cohort study involving over 2 million individuals showed that both NAFLD- and MASLD-based definitions were significantly associated with elevated CRC risk, with hazard ratios (HRs) ranging from 1.16 to 1.32 depending on liver fibrosis status [7].
Some evidence suggests that CRC risk may be particularly elevated in individuals with liver fibrosis, a lean phenotype, diabetes, or hypertension [6,7,8]. However, these observations remain limited and are not consistent across studies. Furthermore, data from two large European registries and a recent Swedish nationwide analysis found that excess CRC risk appeared only in men, with no statistically significant increase observed among women with NAFLD or MASLD [9,10,11].
These findings suggest that MASLD may function not only as an independent risk factor but also as a risk modifier, with heterogeneous effects across demographic and metabolic strata. In some populations—particularly women—MASLD’s impact on CRC risk may be negligible. Clarifying whether MASLD confers excess CRC risk in women, and identifying which subgroups are disproportionately affected, is essential for improving risk stratification and guiding targeted preventive strategies.
Therefore, we used a nationwide, population-based female cohort to (i) quantify the overall association between MASLD and incident CRC and (ii) explore effect modification across predefined subgroups to clarify whether MASLD constitutes a CRC risk factor in women.
2. Materials and Methods
2.1. Database
This study established a nationwide retrospective cohort using integrated data provided by the Health and Medical Big Data Integration Platform, which consolidates records from the National Health Insurance Service (NHIS), Health Insurance Review and Assessment Service (HIRA), Korea Central Cancer Registry (KCCR), and Statistics Korea. Access to these population-level data was granted by the institutional ethics committee (local IRB number: KC23ZISI0410, 15 June 2023) and the governmental data access committee (Project No. 2023-00025). The database covers nearly the entire Korean population and provides longitudinal information on demographics, health-screening records, socioeconomic status, medical diagnoses, prescriptions, cancer registrations, and verified causes of death. However, owing to the limitations associated with large-scale data extraction, approximately one million women aged 40–59 years were randomly sampled by region from the overall eligible population. All personal identifiers were removed prior to the analysis. Given the retrospective nature of this study, the requirement for written informed consent was waived. The study was conducted in accordance with the applicable data protection regulations and ethical standards of the Republic of Korea.
2.2. Study Design and Enrollment Criteria
The study population consisted of women aged 40–59 years who underwent at least one national general health screening between 1 January 2013 and 31 December 2016. The date of the first screening was defined as the index date. As the NHIS database records only the year of examination and not the exact date, the year of the first screening was considered the year of cohort entry. The preceding year served as a washout period.
Follow-up for incident CRC (International Classification of Diseases, 10th Revision [ICD-10] codes: C18–C20) began on January 1 of the year after cohort entry and continued until 31 December 2021. The specific inclusion and exclusion criteria for study enrollment are summarized in Table 1. Diagnostic codes used for the exclusion criteria are listed in Supplementary Table S1.
2.3. Definition of Variables
Baseline demographic, clinical, socioeconomic, and lifestyle characteristics were obtained from NHIS records. Physical activity was assessed from responses to a health-screening questionnaire, which included the number of days participants engaged in walking, moderate-intensity physical activity, and vigorous-intensity physical activity per week. The total weekly physical activity score was calculated based on the estimated metabolic equivalent of task (MET) values assigned to each activity level. These scores were then categorized into quartiles to define physical activity levels for analysis.
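The scoring step described above can be sketched as follows. This is a minimal illustration, not the authors' scoring code: the text does not state the MET weights, so the values in `MET` (3.3, 4.0, and 8.0, borrowed from the IPAQ scoring convention) are an assumption, as are the function names and the made-up questionnaire responses. Activity duration is not modeled because the text mentions only days per week.

```python
import bisect
import statistics

# ASSUMPTION: MET weights per activity type (IPAQ convention); not given in the text.
MET = {"walking": 3.3, "moderate": 4.0, "vigorous": 8.0}


def weekly_score(walking_days, moderate_days, vigorous_days):
    """Weighted sum of days per week for each activity level."""
    return (MET["walking"] * walking_days
            + MET["moderate"] * moderate_days
            + MET["vigorous"] * vigorous_days)


def quartile_labels(scores):
    """Label each score 1 (lowest quartile) to 4 (highest quartile)."""
    cuts = statistics.quantiles(scores, n=4)  # three cut points
    return [bisect.bisect_right(cuts, s) + 1 for s in scores]


# Hypothetical responses: (walking, moderate, vigorous) days per week.
responses = [(0, 0, 0), (2, 0, 0), (3, 1, 0), (5, 0, 0),
             (5, 2, 0), (7, 2, 1), (7, 3, 2), (7, 5, 4)]
scores = [weekly_score(*r) for r in responses]
labels = quartile_labels(scores)
```

In a real cohort the cut points would be estimated once on the full analytic sample and then applied to every participant.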
Comorbidities, including hypertension, diabetes mellitus, dyslipidemia, and cardiovascular disease, were identified using ICD-10 codes [12] from inpatient and outpatient claims (Supplementary Table S1). To assess the overall burden of comorbidities, we used the Charlson Comorbidity Index (CCI) [13], a validated scoring system widely used to quantify the severity of chronic conditions. Each comorbidity category was assigned a weighted score, and the total CCI score was included in the multivariable analyses to adjust for the potential confounding effects of chronic diseases (Supplementary Table S2).
2.4. Assessment of MASLD
MASLD was defined according to the latest international criteria, which combine three components: hepatic steatosis, metabolic dysfunction, and alcohol consumption. Participants were classified as having MASLD if they had a documented diagnosis of fatty liver (ICD-10 code: K76.0) or met the specific criteria based on the Hepatic Steatosis Index (HSI). The detailed components and thresholds used for the MASLD definition are summarized in Table 2.
The HSI is a widely validated prediction model that incorporates the alanine transaminase/aspartate transaminase ratio, BMI, sex, and the presence of diabetes mellitus. The HSI has shown good predictive performance for fatty liver in the Korean population, with an area under the receiver operating characteristic curve of 0.82 (95% confidence interval [CI] 0.81–0.83) at values above 36 [14].
Although the fatty liver index (FLI), like the HSI, is a useful quantitative tool for predicting the presence of MASLD without imaging studies or biopsies [15], its reliance on a greater number of variables, such as waist circumference, triglycerides, and gamma-glutamyl transferase levels, led us to use only the HSI to identify the MASLD group in our study.
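For orientation, the HSI calculation can be sketched as below. The formula is not spelled out in this text; the sketch assumes the commonly published form, HSI = 8 × (ALT/AST) + BMI, plus 2 for women and plus 2 for diabetes mellitus, together with the cutoff of 36 mentioned here. Function names and the example values are hypothetical, and the metabolic criteria of the MASLD definition (Table 2) are not modeled.

```python
HSI_CUTOFF = 36.0  # steatosis threshold quoted in the text


def hepatic_steatosis_index(alt, ast, bmi, female, diabetes):
    """HSI = 8 * (ALT / AST) + BMI (+2 if female) (+2 if diabetes mellitus).

    ASSUMPTION: this is the commonly published HSI formula, not one
    reproduced from the present paper.
    """
    if ast <= 0:
        raise ValueError("AST must be positive")
    score = 8.0 * (alt / ast) + bmi
    if female:
        score += 2.0
    if diabetes:
        score += 2.0
    return score


def suggests_steatosis(alt, ast, bmi, female=True, diabetes=False):
    """True when the index reaches the cutoff (steatosis component only)."""
    return hepatic_steatosis_index(alt, ast, bmi, female, diabetes) >= HSI_CUTOFF


# Hypothetical participant with made-up laboratory values.
example = hepatic_steatosis_index(alt=40, ast=25, bmi=24.0, female=True, diabetes=False)
```

A woman with a BMI in the normal range can therefore still cross the cutoff when ALT is high relative to AST, which is the kind of participant the subgroup findings focus on.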
2.5. Study Outcomes
The primary outcome was incident CRC. In addition, the potential for effect modification was evaluated across pre-specified subgroups, including age, dyslipidemia, waist circumference, BMI category, smoking status, physical activity, alcohol intake, and comorbidity burden.
2.6. Statistical Analyses
Baseline demographic and clinical characteristics were compared using Student’s t-test for continuous variables and the chi-squared test for categorical variables. The cumulative incidence of CRC was estimated using Kaplan–Meier curves and compared using the log-rank test. Univariate and multivariate Cox proportional hazards regression models were used to evaluate factors associated with the outcomes. The proportional hazards assumption was verified using log-log survival plots. The analysis was conducted using a complete-case approach, including only participants with no missing values for the key covariates. A multivariate model was built using the stepwise method and adjusted for relevant covariates. Statistical significance was defined as two-sided p < 0.05. Effect modification was evaluated using cross-product interaction terms in the Cox model. Interactions were considered suggestive at p for interaction < 0.15, and those with opposite directions of association between strata were interpreted as indicative of effect modification [16]. All statistical analyses were performed using the SAS software (version 9.3; SAS Institute Inc., Cary, NC, USA).
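As an illustration of the Kaplan–Meier step, the product-limit estimator can be written in a few lines. This is a teaching sketch on made-up toy data, not the SAS analysis used in the study; the function and variable names are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time per participant
    events : 1 if the outcome occurred at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit update: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # everyone observed at t leaves the risk set
    return curve


# Toy data (made up): four participants, one censored at t = 2.
toy_times = [1, 2, 3, 4]
toy_events = [1, 0, 1, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
cumulative_incidence = [(t, 1.0 - s) for t, s in toy_curve]
```

Cumulative incidence is reported here as one minus the survival estimate, which treats deaths from other causes as censoring.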
3. Results
3.1. Baseline Characteristics
From an initial database of 570,690 women aged 40–59 years who underwent the National Health Screening between 2013 and 2016, 87,289 were excluded due to duplicate records, missing key variables, heavy alcohol use (≥20 g/day), viral hepatitis or cirrhosis, organ transplantation, or a prior cancer diagnosis. The final study cohort included 483,401 participants, of whom 128,642 (26.6%) had MASLD, and 354,759 (73.4%) did not (Figure 1).
Compared with the non-MASLD group, women in the MASLD group were more likely to be aged 50–59 years (69.8% vs. 62.7%; p < 0.001), severely obese (BMI ≥ 30 kg/m²: 25.2% vs. 0.1%; p < 0.001), and centrally obese (waist circumference ≥ 85 cm: 55.3% vs. 7.8%; p < 0.001). They were also more likely to report low physical activity (lowest quartile: 27.7% vs. 22.9%; p < 0.001) and to be in the lower income strata (32.0% vs. 34.7%; p < 0.001). The MASLD group also had higher rates of diabetes mellitus (22.3% vs. 6.1%), hypertension (37.2% vs. 17.0%), dyslipidemia (35.0% vs. 22.2%), and cardiac disease (14.0% vs. 9.0%) (all p < 0.001), resulting in a higher mean CCI (3.18 ± 1.40 vs. 2.69 ± 1.04; p < 0.001) (Table 3).
3.2. Factors Associated with Incident CRC
The median follow-up was 7.51 years (interquartile range, 7.51–8.51) in both groups. During this period, 2432 incident CRC cases were identified: 702/128,642 (0.55%) in the MASLD group and 1730/354,759 (0.49%) in the non-MASLD group. Kaplan–Meier analysis demonstrated a significantly higher cumulative incidence of CRC in the MASLD group than in the non-MASLD group (Figure 2, log-rank p = 0.006). The estimated cumulative risks of CRC at 3, 5, and 7 years were 0.18% vs. 0.17%, 0.32% vs. 0.30%, and 0.47% vs. 0.43% in the MASLD and non-MASLD groups, respectively. In the multivariate Cox proportional hazards model (Table 4), the following variables were independently associated with CRC: age 50–59 years (HR 1.508, 95% CI 1.378–1.650, p < 0.001), highest income decile (8–10) (HR 0.755, 95% CI 0.579–0.986, p = 0.038), current smoker (HR 1.329, 95% CI 1.068–1.653, p = 0.01), and MASLD (HR 1.095, 95% CI 1.003–1.195, p = 0.044).
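The reported counts and the adjusted MASLD estimate can be cross-checked with a short calculation. The sketch below computes the crude (unadjusted) risk ratio from the case counts and back-calculates an approximate p-value from the hazard ratio and its 95% CI, assuming the interval is a Wald interval on the log scale; that assumption and the helper name are not from the text.

```python
import math


def p_from_hr_ci(hr, lower, upper):
    """Approximate two-sided p-value from an HR and its 95% CI.

    ASSUMPTION: the CI is a Wald interval, i.e. log(HR) +/- 1.96 * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p


# Case counts reported in the text.
masld_cases, masld_n = 702, 128_642
ref_cases, ref_n = 1730, 354_759

crude_rr = (masld_cases / masld_n) / (ref_cases / ref_n)

# Adjusted MASLD estimate reported in the text: HR 1.095 (1.003-1.195).
z, p = p_from_hr_ci(1.095, 1.003, 1.195)
```

The crude ratio comes out near 1.12 and the back-calculated p-value near 0.04, both in line with the adjusted HR of 1.095 and the reported p = 0.044; small differences are expected because the published figures are rounded.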
3.3. Subgroup Analyses
As shown in Figure 3, effect modification was detected for age, dyslipidemia status, and waist circumference. In the stratified multivariate models, MASLD was independently associated with CRC risk among women aged 40–49 years (HR 1.261, 95% CI 1.049–1.501, p = 0.009), those without dyslipidemia (HR 1.146, 95% CI 1.030–1.276, p = 0.012), and those with a waist circumference < 85 cm (HR 1.151, 95% CI 1.020–1.298, p = 0.022). In the complementary strata (age 50–59 years, presence of dyslipidemia, or waist circumference ≥ 85 cm), MASLD was not significantly associated with CRC risk.
4. Discussion
This nationwide analysis of 483,401 Korean women showed that MASLD was independently associated with a significant increase in incident CRC, even after adjusting for demographic, socioeconomic, and metabolic covariates. Importantly, this excess hazard was not uniform; it was concentrated in women aged 40–49 years, those without clinically coded dyslipidemia, and those with a waist circumference of <85 cm. These findings extend current knowledge by demonstrating the heterogeneous nature of MASLD’s oncogenic impact; while the associated risk is modest, it may suggest meaningful public health implications when considered on a nationwide scale.
Previous studies have reported mixed conclusions regarding the association between MASLD (or NAFLD) and CRC in women. In the Swedish National Patient Registry [10], which included approximately 80,000 individuals with NAFLD, the HR for CRC was 1.54 (95% CI 1.13–2.08) in men but only 1.21 (0.84–1.73) in women. A Korean hospital screening cohort of 25,947 participants showed a similar sex contrast, with HRs of 2.21 (1.26–3.87) for men and 1.00 (0.37–2.70) for women [9]. By contrast, a Korean national screening study [7] of 8.9 million adults reported significant associations in both sexes, implying that null results in smaller cohorts of women may reflect limited statistical power.
Apart from the differences in case numbers between studies, several methodological and biological factors may explain these discrepant findings. First, nearly two-thirds of women in our cohort were aged ≥ 50 years and therefore peri- or post-menopausal, whereas many earlier cohorts of women were predominantly premenopausal. Estrogen deficiency accelerates visceral fat accumulation, hepatic insulin resistance, and systemic inflammation, all of which promote colorectal carcinogenesis [17,18]. The older hormonal profile in our cohort may therefore provide a plausible biological setting in which the MASLD-CRC association becomes detectable, whereas younger estrogen-replete populations may mask this effect. Second, our study applied a stricter definition of MASLD than that used in earlier women-null reports. We required hepatic steatosis, indicated by an HSI of ≥36, in combination with at least one metabolic abnormality, whereas previous studies often relied solely on ultrasound findings or registry codes to define NAFLD [9,10]. This stricter definition reduces the inclusion of metabolically benign steatosis and enriches for women in whom hepatic fat coexists with systemic metabolic stress. Such enrichment aligns more closely with the biological pathways that promote colorectal tumorigenesis, providing further rationale for the positive association observed in our data.
Importantly, the association between MASLD and CRC differed according to age. In our cohort, women aged 40–49 years had an HR of 1.254 (95% CI 1.054–1.493), whereas those aged 50–59 years showed no significant increase (1.047 [0.946–1.159]). Similarly, a prospective Chinese cohort of 63,696 adults found that NAFLD diagnosed before the age of 45 years carried the greatest digestive cancer risk, with the hazard attenuating at older ages. Early-onset MASLD may lead to prolonged hepatic lipotoxicity and chronic subclinical inflammation, thereby providing a longer window for malignant transformation. Mechanistic studies of early-onset CRC (EOCRC) strengthen this interpretation [19]. Population data from the United States and Europe have shown a rapid increase in EOCRC, which parallels increasing obesity and metabolic dysfunction in younger adults. Previous studies have linked hyperinsulinemia and activation of the insulin-like growth factor axis to enhanced proliferation and reduced apoptosis of colonic epithelial cells [20]. Recent evidence indicated that NAFLD is independently associated with a higher risk of EOCRC, particularly in the left colon and rectum [21].
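The age contrast quoted in this paragraph can be checked roughly from the two stratum-specific estimates alone. The sketch below compares two independent log hazard ratios with a z-test on their difference. This is a stand-in technique, not the cross-product interaction term the authors fitted in the Cox model, and it assumes the reported 95% CIs are Wald intervals; the helper names are hypothetical.

```python
import math


def log_hr_and_se(hr, lower, upper):
    """Log hazard ratio and its standard error, back-calculated from a 95% CI.

    ASSUMPTION: the CI is log(HR) +/- 1.96 * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    return math.log(hr), se


def heterogeneity_test(est_a, est_b):
    """z-test for the difference between two independent log hazard ratios."""
    (log_a, se_a), (log_b, se_b) = est_a, est_b
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    ratio_of_hrs = math.exp(log_a - log_b)
    return ratio_of_hrs, z, p


# Stratum-specific estimates quoted in this paragraph.
age_40_49 = log_hr_and_se(1.254, 1.054, 1.493)
age_50_59 = log_hr_and_se(1.047, 0.946, 1.159)

ratio_of_hrs, z_het, p_het = heterogeneity_test(age_40_49, age_50_59)
```

Under these assumptions the ratio of hazard ratios is about 1.2 with a p-value of roughly 0.08, which falls below the p for interaction < 0.15 threshold the Methods describe as suggestive, although it would not meet a conventional 0.05 cutoff.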
In MASLD, excessive production of secondary bile acids (BAs), accumulation of deoxycholic acid, and gut microbial dysbiosis can promote intestinal inflammation and carcinogenesis [22]. However, statins, which are commonly prescribed to patients with dyslipidemia, inhibit HMG-CoA reductase, thereby lowering the production of secondary BAs and reducing the intestinal BA load [23]. This modulation may attenuate β-catenin activation and oxidative DNA damage in the colon. Large meta-analyses have shown that statin therapy, routinely prescribed to most individuals with ICD-10 code E78 (dyslipidemia), reduces CRC incidence by approximately 10% [24]. This mechanism could partly explain why the carcinogenic effect of MASLD was not evident in dyslipidemic women but became apparent in those without dyslipidemia. Furthermore, women classified as normolipidemic are generally untreated and may carry atherogenic remnant-rich lipoproteins that are not captured by standard lipid panels [25]. These findings imply that pharmacological lipid-lowering therapy can mask the MASLD-linked CRC risk, whereas residual metabolic dyslipidemia may expose this risk.
Finally, MASLD predicted CRC only among women with a waist circumference < 85 cm. While visceral obesity is a well-established driver of CRC, many studies have shown that NAFLD in lean or non-obese individuals confers a stronger CRC risk than NAFLD in their obese counterparts [26,27]. Lean NAFLD patients often exhibit insulin resistance and dyslipidemia, and these metabolic abnormalities may promote colorectal carcinogenesis through activation of the insulin–IGF signaling pathway [28]. However, it remains difficult to clearly explain why the MASLD–CRC association was not observed in obese women. One possible explanation is that in women with central obesity, the baseline CRC risk is already high, leaving little additional risk attributable to hepatic fat.
Our findings imply that MASLD status may serve as an additional marker for risk stratification, particularly in women aged 40–59 years. First, incorporating simple steatosis indices into routine assessments could assist in identifying women with potential metabolic risks. This would facilitate the identification of women with occult metabolic risks who might otherwise be overlooked, such as those with a normal BMI or waist circumference but significant hepatic steatosis. Implementing these indices as a primary screening tool could enhance the “triage” process for CRC surveillance in primary care settings. Second, while Korean screening guidelines are lowering the initiation age—from 50 in the national program to 45 in academic recommendations—to address the rising incidence of EOCRC, metabolic risk assessment is not yet integrated. Diversifying the screening initiation age or customizing screening intensity through risk stratification based on metabolic markers, such as MASLD, could potentially enhance early detection and optimize resource allocation. Third, clinicians should continue to emphasize metabolic optimization—including lifestyle modification and management of dyslipidemia—which may help mitigate CRC risk in MASLD-positive women who appear otherwise metabolically healthy. Recognizing these associations could contribute to a more comprehensive approach to CRC prevention in this specific population.
Future research should address not only the biological mechanisms linking MASLD to colorectal carcinogenesis but also conduct studies in diverse cohorts to reach a consensus on the heterogeneity of CRC risk across subgroups. Furthermore, the cumulative impact of MASLD duration and severity is worth exploring. Such evidence will be essential to establishing precise risk stratification models and enhancing our understanding of MASLD-associated carcinogenesis.
This study had several limitations that warrant consideration. First, its retrospective, claims-based design precludes causal inference and allows for residual or unmeasured confounding factors despite extensive covariate adjustment. Second, key exposures, including alcohol intake, physical activity, and medication use, were captured only at baseline; changes during follow-up could not be modeled, potentially leading to misclassification. Third, this cohort comprised Korean women aged 40–59 years who participated in the national screening program; therefore, the findings may not be generalizable to men, other ethnicities, or younger and older age groups. Fourth, we lacked data on menopausal hormone therapy, dietary patterns, and gut microbiome composition, which prevented the assessment of their potential effect-modifying roles. Finally, we were unable to account for competing risks of non-CRC mortality, which might have led to an overestimation of the cumulative incidence of CRC. These limitations should be weighed against the strengths of our large nationally representative cohort when interpreting the clinical implications of MASLD-associated CRC risk in women.
5. Conclusions
In conclusion, in this large nationwide cohort of Korean women, MASLD was independently associated with a 10% increase in incident CRC. Excess risk was confined to women aged 40–49 years, those without dyslipidemia, and those with a waist circumference of <85 cm, underscoring the substantial heterogeneity in MASLD-related carcinogenesis. These findings highlight the importance of age- and metabolism-specific risk stratification, suggesting that MASLD status could serve as a marker to identify women who may benefit from closer clinical attention and metabolic optimization.
References
- 1. Eslam M., Sanyal A.J., George J., on behalf of the International Consensus Panel. MAFLD: A Consensus-Driven Proposed Nomenclature for Metabolic Associated Fatty Liver Disease. Gastroenterology 2020, 158, 1999–2014.e1. doi:10.1053/j.gastro.2019.11.312. PMID: 32044314.
- 2. Younossi Z.M., Koenig A.B., Abdelatif D., Fazel Y., Henry L., Wymer M. Global epidemiology of nonalcoholic fatty liver disease-Meta-analytic assessment of prevalence, incidence, and outcomes. Hepatology 2016, 64, 73–84. doi:10.1002/hep.28431. PMID: 26707365.
- 3. Lee H.H., Lee H.A., Kim E.J., Kim H.Y., Kim H.C., Ahn S.H., Lee H., Kim S.U. Metabolic dysfunction-associated steatotic liver disease and risk of cardiovascular disease. Gut 2024, 73, 533–540. doi:10.1136/gutjnl-2023-331003. PMID: 37907259.
- 4. Bray F., Laversanne M., Sung H., Ferlay J., Siegel R.L., Soerjomataram I., Jemal A. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2024, 74, 229–263. doi:10.3322/caac.21834. PMID: 38572751.
- 5. Chen J., Bian D., Zang S., Yang Z., Tian G., Luo Y., Yang J., Xu B., Shi J. The association between nonalcoholic fatty liver disease and risk of colorectal adenoma and cancer incident and recurrence: A meta-analysis of observational studies. Expert. Rev. Gastroenterol. Hepatol. 2019, 13, 385–395. doi:10.1080/17474124.2019.1580143. PMID: 30791768.
- 6. McHenry S., Zong X., Shi M., Fritz C.D., Pedersen K.S., Peterson L.R., Lee J.K., Fields R.C., Davidson N.O., Cao Y. Risk of nonalcoholic fatty liver disease and associations with gastrointestinal cancers. Hepatol. Commun. 2022, 6, 3299–3310. doi:10.1002/hep4.2073. PMID: 36221229. PMCID: PMC9701484.
- 7. Lee H., Lee H.W., Kim S.U., Kim H.K. Metabolic dysfunction-associated fatty liver disease increases colon cancer risk: A nationwide cohort study. Clin. Transl. Gastroenterol. 2022, 13, e00435. doi:10.14309/ctg.0000000000000435. PMID: 35080508. PMCID: PMC8806363.
- 8. Li Y., Ma X.M., Jia J.G., Cao L.Y. Association between metabolic dysfunction-associated steatotic liver disease, metabolic dysfunction subtypes and risk of colorectal cancer: A prospective cohort study. Clin. Res. Hepatol. Gastroenterol. 2025, 49, 102573. doi:10.1016/j.clinre.2025.102573. PMID: 40097070.
