Identifying sorafenib benefit among patients with hepatocellular carcinoma: A transcriptomic and genomic approach
Sun Young Yim, Hayeon Kim, Tae Hyung Kim, Sang-Hee Kang, Youngwoo Lee, Eunho Choi, Yang Jae Yoo, Seong Hee Kang, Young-Sun Lee, Young Kul Jung, Yeon Seok Seo, Hyung Joon Yim, Jong Eun Yeon, Kyung Suk Yang, Yitao Tang, Bowha Sohn, Yun Seong Jeong, Hyewon Park, Han Liang

TL;DR
This study identifies a 50-gene signature called KUSS50 that can predict which patients with liver cancer will benefit from sorafenib treatment, helping to personalize therapy and improve outcomes.
Contribution
The study introduces KUSS50, a novel 50-gene signature for predicting sorafenib benefit in hepatocellular carcinoma.
Findings
The KUSS50 gene signature showed high predictive accuracy in identifying sorafenib beneficiaries (AUC: 87.1% and 90.8% in validation cohorts).
KUSS50-defined subtypes are associated with distinct molecular features like ferroptosis activation and β-catenin mutations.
The KUSS50 signature enables personalized treatment selection and may guide new therapeutic strategies for HCC.
Abstract
Sorafenib has been a cornerstone of hepatocellular carcinoma (HCC) therapy; however, its efficacy is limited, and identifying patients who will benefit from sorafenib is challenging. We aimed to identify predictive biomarkers of sorafenib benefit in patients with HCC. Gene expression data from 33 HCC tumors treated with sorafenib were analyzed to construct a prediction model aimed at identifying patients with greater benefit from sorafenib treatment. The robustness of the predictor was validated using gene expression data from two phase III clinical trials, IMbrave150 and STORM. The analysis of transcriptome data revealed a 50-gene signature, the KUSS50 (Korea University Sorafenib Signature with 50 genes), that exhibited high predictive power in identifying patients who benefited from sorafenib treatment in a training cohort. Validation in two independent cohorts – IMbrave150 (n = 48)…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHepatocellular Carcinoma Treatment and Prognosis · Cancer, Lipids, and Metabolism · Kruppel-like factors research
Introduction
Hepatocellular carcinoma (HCC) is a highly aggressive and lethal liver cancer, accounting for a notable portion of cancer-related deaths worldwide.1^,^2 HCC primarily affects individuals with chronic liver diseases, such as cirrhosis, hepatitis B or C virus infection, and metabolic dysfunction–associated steatotic liver disease.[3], [4], [5], [6] Despite advances in diagnostic techniques and treatment strategies, HCC remains a significant global health burden, with a 5-year survival rate of only 18%.7
Sorafenib, a multikinase inhibitor, has been the cornerstone of systemic therapy for advanced HCC since its approval by the US FDA in 2007.8^,^9 Sorafenib primarily targets the serine-threonine kinases Raf-1 and B-Raf as well as the receptor tyrosine kinases vascular endothelial growth factor receptor and platelet-derived growth factor receptor.10 By inhibiting these kinases, sorafenib restrains tumor cell proliferation, angiogenesis, and tumor cell survival, ultimately leading to improved survival for patients with advanced HCC. Despite the initial success of sorafenib in HCC treatment, its long-term efficacy remains limited owing to several factors, including intrinsic and acquired resistance, tumor heterogeneity, and the complex tumor microenvironment.[11], [12], [13], [14] Additionally, sorafenib is associated with various adverse effects, including fatigue, diarrhea, hand and foot skin reactions, and hypertension, which notably reduce patients' quality of life and adherence to treatment.12 Furthermore, not all patients derive equal benefit from sorafenib therapy,8^,^9^,^11^,^12 underscoring the need for reliable biomarkers to predict treatment outcomes.
To address these limitations and improve patient outcomes, several studies have identified predictive biomarkers to guide patient selection and optimize sorafenib use. For instance, Ang-2 was identified as a potential biomarker of poor outcome for patients with HCC given sorafenib.15 Furthermore, VEGFA amplification in HCC cells was associated with better outcomes for patients who underwent sorafenib treatment.16 Additionally, polymorphisms in genes involved in sorafenib metabolism, such as polymorphisms relevant to UDP-glucuronosyltransferase 1A9, have been linked with sorafenib response and toxicity.17
Although several biomarkers for predicting sorafenib treatment benefit have been proposed, their clinical utility and reproducibility remain subject to validation in large multicenter cohorts. Challenges such as interpatient variability, tumor heterogeneity, and the dynamic nature of treatment response underscore the need for robust and reliable biomarkers that can accurately stratify patients with HCC as those who would and those who would not benefit from sorafenib treatment, guiding treatment decisions. Moreover, integrating multiple biomarkers into predictive models or nomograms may bolster the predictive precision and clinical usefulness of such models and nomograms, providing clinicians with valuable tools for treatment selection and monitoring. In the present investigation, we aimed to address these critical needs by analyzing the transcriptome data of HCC samples collected during prospective clinical trials. Using an in-depth analysis, we sought to identify predictive biomarkers essential for optimizing sorafenib treatment strategies and ultimately improving patient outcomes.
Patients and methods
Patients, sample collection, and RNA sequencing
Demographic information, clinical data, and tissue samples were obtained for 33 patients who had undergone sorafenib treatment from 2010 to 2018 at Korea University (KU) Hospital (Table S1), under the approval of the Institutional Review Board (IRB No. 2017GR0176 and 2018AN0445). Most of the patients were male (81.8%), had hepatitis B virus infection (72.7%), and presented with extrahepatic metastasis (90.9%) at the initiation of sorafenib. Among the patients whose disease progressed during sorafenib treatment, 42.4% received second-line chemotherapy. The median follow-up duration was 12.6 months (range, 1.3-69.5 months).
All patients underwent liver biopsy analysis prior to sorafenib use, and total RNA was obtained from their formalin-fixed, paraffin-embedded tissue blocks. Total RNA was extracted from slides containing tissue sections from these blocks using a Purigen Biosystems Ionic Purification System, and the DV200 values (representing the percentage of fragments >200 nucleotides) exceeded 15%. mRNA expression data for HCCs in the study patients were subsequently generated on the Illumina NovaSeq 6000 platform, employing S4-xp-200 lanes with a sequencing depth of 50 million reads per sample. The quality of the raw reads was checked using FastQC (v0.11.5) and summarized using MultiQC (v1.7). FASTQ files were mapped to the human reference genome (GRCh38) using STAR (v2.7.10a), and gene expression levels were quantified using RSEM (v1.3.3) with default parameter settings. The resulting transcriptomic dataset is available in the NCBI Gene Expression Omnibus under accession number [GSE285625](GSE285625).
Gene expression and clinical data from validation cohorts
Gene expression and clinical data from two clinical trials (IMbrave150 and GO30140) were obtained from Genentech, and all datasets are available from the European Genome-Phenome Archive database (accession number EGAD00001008128).[18], [19], [20] The IMbrave150 (NCT03434379) trial was a phase III clinical study evaluating the safety and efficacy of atezolizumab plus bevacizumab vs. sorafenib as the primary treatment for unresectable HCC.18^,^21 Similarly, in the GO30140 (NCT02715531) trial, researchers investigated the safety and dosage of atezolizumab alone and combined with bevacizumab for patients with HCC.19 The key endpoints of these studies included the confirmed objective response rate in all patients who received the treatments and overall survival and progression-free survival in the intention-to-treat population, which were assessed using an impulse residue function according to RECIST v1.1.22 Combining the patient populations from both trials resulted in 253 patients in the atezolizumab-plus-bevacizumab arm, 47 patients in the atezolizumab-only arm, and 58 patients in the sorafenib arm with available gene expression data. Patients without clinical outcome data were excluded, leaving 247 patients in the atezolizumab-plus-bevacizumab arm, 43 patients in the atezolizumab-only arm, and 48 patients in the sorafenib arm for analysis.
Gene expression and clinical data from the BIOSTORM trial cohort were obtained from the Gene Expression Omnibus under accession number [GSE109211](GSE109211).23^,^24 The BIOSTORM cohort was a subset of the cohort in the phase III STORM trial focused on biomarker research. The National Cancer Institute proliferation (NCIP) subtype was used as an exemplar genomic subtype significantly associated with HCC prognosis, as previously described.[25], [26], [27] Additionally, genomic, histological, and clinical data – including information on mutations and copy number alterations – from The Cancer Genome Atlas (TCGA) HCC project were obtained and analyzed in accordance with prior studies.[28], [29], [30], [31]
Data analysis
Collected gene expression data were transformed and normalized as described previously.[25], [26], [27], [28], [29], [30], [31] BRB-ArrayTools (v4.6.1), a freeware program from the National Cancer Institute (http://brb.nci.nih.gov/BRB-ArrayTools.html), was used to analyze the data and build a predictive model.32 Cluster and TreeView were used to generate a heat map of gene expression data.33 All statistical analyses were performed in the R language (http://www.r-project.org).
To identify genes associated with the benefit of systemic sorafenib administration in patients with HCC, we selected genes whose expression was significantly different (p <0.05 [Student’s t test]) between patients in a KU training cohort who did (benefit group) and did not (non-benefit group) experience treatment benefit. The benefit group was defined as patients with controlled disease, which included objective response and stable disease evaluated according to RECIST v1.1. following sorafenib treatment, whereas the non-benefit group was defined as patients with progressive disease. To construct a Bayesian prediction model, we selected 50 genes whose expression was most significantly upregulated in the benefit group. The resulting 50-gene signature was named the KU Sorafenib Signature with 50 genes (KUSS50).
Before independent gene expression data were pooled to perform each analysis, each gene expression dataset was independently standardized to have a mean of 0 and standard deviation of 1 as described previously.[25], [26], [27], [28], [29], [30], [31] To estimate the likelihood that a specific human HCC would exhibit a particular gene expression pattern, we employed a Bayesian compound covariate prediction algorithm based on training sets. The gene expression data from the KU cohort, which served as the training sets, were combined to construct a predictive model using the Bayesian compound covariate prediction framework. To assess the model's robustness, we conducted leave-one-out-cross-validation during the training process. A Bayesian probability threshold of 0.7 yielded the highest positive prediction value of 0.917 and a negative prediction value of 0.81.
Receiver-operating characteristic (ROC) curve analyses were carried out to estimate the discriminatory power of the KUSS50. The AUC was calculated, which ranged from 0.5 (for a non-informative predictive marker) to 1 (for a perfect predictive marker), and a bootstrap method (1,000 resampling) was used to calculate the 95% CI for the AUC. Multivariate analysis using Cox regression was performed for overall survival to identify factors that could predict an overall survival benefit from sorafenib.
Tumor purity in TCGA samples was assessed using three complementary approaches. Tumor purity was defined as the proportion of tumor cells relative to the total cell population within a given sample. First, histopathological evaluation was performed using conventional H&E staining. Board-certified pathologists at the Biospecimen Core Resource of Nationwide Children's Hospital reviewed digital H&E-stained slides and visually estimated tumor cellularity in representative regions with minimal necrosis or hemorrhage.31 Second, we applied the ESTIMATE algorithm, which infers stromal and immune cell infiltration based on the gene expression profiles of 141 immune-related and 141 stromal-related genes.34 In this RNA-based method, higher stromal or immune scores reflect greater non–tumor cell content, and tumor purity was calculated as one minus the sum of the normalized stromal and immune scores. Third, tumor purity was estimated using ABSOLUTE, a DNA-based method that models allele-specific somatic copy-number alterations and heterozygosity.35 This approach provided a model-derived estimate of tumor purity for each TCGA sample.
Results
Gene expression signature associated with benefit from sorafenib treatment
To identify potential biomarkers that could predict which patients would benefit from treatment with sorafenib, we categorized patients in the KU cohort into two groups based on their responses to the drug. The non-benefit group consisted of 18 patients who received sorafenib and experienced disease progression, whereas the benefit group consisted of 15 patients who received sorafenib and did not experience disease progression. For all patients, the median overall survival duration was 12.6 months. As expected, there was a statistically significant difference in overall survival between the benefit and non-benefit groups (Fig. S1). We sought to identify genes whose expression levels differed significantly (p <0.05 [Student’s t test]) between the benefit and non-benefit groups. Of these differentially expressed genes, we focused on those that exhibited higher expression in the benefit group than in the non-benefit group. This strategic selection of upregulated genes in the benefit group was intended to facilitate the development and validation of a predictive model for independent patient cohorts (Fig. S2). Through this approach, we identified the 50 most upregulated genes in the benefit group as candidate biomarkers for predicting sorafenib benefit (Fig. 1A and S3, Table S2). We defined this set of 50 genes as the KUSS50. We then used the KUSS50 to construct a Bayesian prediction model aimed at identifying patients with HCC who would potentially benefit from sorafenib treatment (Fig. 1B). To assess the robustness of the KUSS50 in predicting sorafenib benefit within the KU training cohort, we performed ROC analysis. The AUC for the Bayesian probability derived from the KUSS50 model was remarkably high at 92.6% (95% CI 82.2-99.3%; p = 2.6 × 10^-6^), demonstrating the signature's significant ability to discriminate between the benefit and non-benefit patients in this cohort (Fig. 1C). Multivariate analysis using Cox regression was performed to identify independent factors associated with benefit from sorafenib and revealed that the KUSS50, along with AFP level and patient performance status, remained significantly associated with sorafenib benefit (Table S3).Fig. 1. Genes associated with a benefit of sorafenib treatment in patients with HCC, and the development of a KUSS50 prediction model.(A) Heatmap showing the expression patterns of the 50 genes constituting the KUSS50. Genes were identified in the KU training cohort based on their differential expression between the benefit and non-benefit groups using the Student’s t test (p <0.05) and ranked by fold-change (benefit/non-benefit). Bars indicate relative expression (z-score). (B) Schematic illustration of the Bayesian compound covariate predictor model built with the KUSS50 gene set. The model was trained with the KU cohort (n = 33) and validated with two independent phase III clinical trial datasets (IMbrave150, n = 48; BIOSTORM, n = 67) using LOOCV. (C) Receiver-operating characteristic curve showing the performance of the Bayesian predictor probability for distinguishing patients with HCC who benefited from sorafenib treatment in the KU training cohort during LOOCV. The AUC was 0.926 (95% CI 0.822–0.993), with statistical significance assessed by a bootstrap test with 1,000 resamplings (p = 2.6 × 10^-6^). The AUC and 95% CI were computed using the pROC package (bootstrap method). HCC, hepatocellular carcinoma; KU, Korea University; KUSS50, Korea University Sorafenib Signature with 50 genes; LOOCV, leave-one-out cross-validation.Fig. 1
Validation of the KUSS50 in the IMbrave150 first-line treatment cohort
After identifying that the KUSS50 was significantly associated with favorable outcomes in patients with HCC receiving systemic sorafenib therapy, we attempted to validate the predictive capability of the KUSS50-based Bayesian probability model in an independent cohort. We carried out this validation using the gene expression data of the HCC tissues of patients enrolled in the sorafenib arm of the phase III IMbrave150 clinical trial.18^,^20 We evaluated the predictive performance of the KUSS50-based Bayesian probability model by assessing its ability to accurately identify patients who benefited from sorafenib use, defined as those who had at least a partial response. Remarkably, the AUC calculated in the ROC analysis of the IMbrave150 sorafenib arm was 87.1% (95% CI 75.8-96.1%; p = 1.8 × 10^-4^) (Fig. 2A). This AUC value was similar to that for the original KU training cohort, underscoring the robustness of the KUSS50 model. To further validate the KUSS50's predictive power, we stratified patients in the IMbrave150 sorafenib arm based on a predefined Bayesian probability cut-off of 0.7 (Fig. 2B). This stratification revealed a significant statistical association (p = 0.003 [chi-square test]) between the KUSS50-predicted subtypes and clinical benefit of sorafenib use (Table 1; Fig. 2C). Patients sorted into the benefit group by the KUSS50 model exhibited a substantially higher rate of achieving at least a partial response than those sorted into the non-benefit group.Fig. 2. Significant correlation between the KUSS50 and improved outcomes in patients with HCC who underwent sorafenib treatment in the IMbrave150 trial cohort.(A) Receiver-operating characteristic curve illustrating the performance of the KUSS50-based Bayesian probability model for identifying patients who benefited from sorafenib treatment (i.e. who achieved at least a partial response). The AUC was 0.871 (95% CI 0.758–0.961), and statistical significance was determined using a bootstrap test with 1,000 resamplings (p = 1.8 × 10^-4^). (B) Distribution of the KUSS50 Bayesian probability scores for each patient in the IMbrave150 cohort. The stratification cut-off of 0.7 was predetermined from LOOCV in the KU cohort and used to separate predicted responders (benefit) from non-responders (no benefit). (C) Stacked bar plot showing the proportions of patients in each response category predicted by the KUSS50-based model. Among the 10 predicted responders, 9 (90%) achieved clinical benefit, whereas among the 38 predicted non-responders, 12 (31.5%) did (p = 0.003; Fisher exact test). HCC, hepatocellular carcinoma; KU, Korea University; KUSS50, Korea University Sorafenib Signature with 50 genes; LOOCV, leave-one-out cross-validation; NR, no response; R, responded to treatment by achieving at least a partial response.Fig. 2. Table 1Contingency table depicting the significant association of the KUSS50-based model with sorafenib benefit in the IMbrave150 cohort.Table 1. PredictionResponder (Partial response or better)Non-responderTotalBenefit91221Non-benefit12627Total103848p = 0.003 by χ^2^ test.KUSS50, Korea University Sorafenib Signature with 50 genes.
Notably, when applied to the atezolizumab-only or atezolizumab-plus-bevacizumab treatment arms in the IMbrave150 trial, the KUSS50-predicted subtypes were not significantly associated with treatment benefit (Tables S4 and S5). This observation underscores the high specificity of the KUSS50, which appears to predict the benefit of sorafenib, but not that of other therapeutic regimens, in HCC. This validation in the independent IMbrave150 cohort not only confirmed the predictive capability of the KUSS50 model but also highlighted the signature’s remarkable specificity for sorafenib benefit, further reinforcing its potential clinical use in optimizing patient selection for the targeted therapy.
Validation of the KUSS50 in the BIOSTORM adjuvant treatment cohort
Encouraged by the compelling correlation observed between the KUSS50 predictor and positive treatment outcomes in patients receiving first-line sorafenib therapy for HCC, we aimed to expand the validation process by determining whether the KUSS50 predictor could effectively identify patients who would benefit from receiving sorafenib in the adjuvant setting following primary treatment. To accomplish this, we used gene expression data from the BIOSTORM cohort. The BIOSTORM cohort is a subset of patients from the phase III STORM trial, which investigated potential biomarkers in patients with HCC receiving adjuvant therapy.23^,^24
We applied the KUSS50-based model to the gene expression data from the BIOSTORM cohort and evaluated its ability to accurately identify patients who benefited from adjuvant sorafenib use (defined as responders [recurrence-free patients] in the BIOSTORM study). Remarkably, the AUC from the ROC analysis was 90.8% (95% CI 81.5-97.5%; p = 1.0 × 10^-7^) (Fig. 3A). This AUC value closely mirrored those for both the original KU training cohort and the IMbrave150 sorafenib-only cohort, further validating the robustness of the KUSS50 predictor. To corroborate the predictive power of the KUSS50 model, we stratified patients in the BIOSTORM cohort based on a predefined Bayesian probability cut-off of 0.7 (Fig. 3B). This stratification revealed a highly significant association (p = 1.14 × 10^-8^ [chi-square test]) between the KUSS50 prediction and the clinical benefit derived from adjuvant sorafenib treatment (Table 2; Fig. 3C). Patients assigned to the benefit group by the KUSS50 model exhibited substantially better outcomes than those in the non-benefit group.Fig. 3. Significant correlation between the KUSS50 and improved outcomes in patients with HCC who underwent sorafenib treatment in the BIOSTORM trial.(A) Receiver-operating characteristic curve evaluating the performance of the KUSS50-based Bayesian probability model in identifying patients with HCC who benefited from sorafenib treatment in the BIOSTORM trial cohort. The AUC was 0.908 (95% CI 0.815–0.975), and statistical significance was determined using a bootstrap test with 1,000 resamplings (p = 1.0 × 10^-7^). (B) Distribution of the KUSS50 Bayesian probability scores for each patient in the BIOSTORM cohort. The cut-off probability of 0.7, established during leave-one-out cross-validation in the KU training cohort, was used to stratify patients into predicted benefit and non-benefit groups. (C) Stacked bar plot showing the proportions of patients in each response category predicted by the KUSS50 in the BIOSTORM cohort. Among 21 patients predicted to have a benefit, 18 (85.7%) achieved clinical benefit, whereas of 46 patients predicted to have no benefit, 5 (10.9%) achieved clinical benefit (p = 1.14 × 10^-8^; Fisher exact test). HCC, hepatocellular carcinoma; KU, Korea University; KUSS50, Korea University Sorafenib Signature with 50 genes; NR, no response; R, responded to treatment by achieving recurrence-free.Fig. 3. Table 2Contingency table depicting the significant association of the KUSS50-based model with sorafenib benefit in the BIOSTORM cohort.Table 2. PredictionResponder (recurrence-free)Non-responder (recurrence)TotalBenefit18523Non-benefit34144Total214667p = 1.14 x 10^-8^ by χ^2^ test.KUSS50, Korea University Sorafenib Signature with 50 genes.
This finding in the BIOSTORM cohort provided further compelling evidence that the KUSS50 predictor can effectively identify patients with HCC who are likely to benefit from treatment with sorafenib not only in the first-line setting but also in the adjuvant setting following primary treatment. The KUSS50 model performed consistently well across multiple independent cohorts receiving first-line or adjuvant sorafenib, underscoring its potential clinical utility in guiding personalized treatment decisions and optimizing patient selection for sorafenib therapy.
Genomic and histological characteristics associated with KUSS50-defined subtypes
In subsequent analyses, we assessed genomic characteristics associated with KUSS50-defined subtypes in the TCGA cohort to gain further insight into their underlying biology. Initially, we compared the genomic copy-number alterations between the two KUSS50-defined subtypes, finding no discernible differences (Fig. 4A). However, we observed a notable difference in the mutation burden between the subtypes, with the non-benefit subtype exhibiting markedly higher mutation rates than its counterpart (Fig. 4B).Fig. 4. Genomic alterations and purity scores associated with the KUSS50-defined subtypes.(A) Box plot showing the fraction of the genome altered by copy-number gain and loss in HCC samples from the TCGA-LIHC dataset (n = 371). Differences between the KUSS50-defined benefit and non-benefit groups were assessed using the Student’s t test (p = 0.31). (B) Comparison of non-synonymous mutation burdens between KUSS50-defined subtypes in the TCGA-LIHC cohort (n = 367). The benefit group showed a significantly higher mutation rate than the non-benefit group (p = 2.1 × 10^-4^, Student’s t test). (C) Box plots of tumor purity estimated by three independent methods: pathologic assessment of tumor-cell percentages using IHC, computational inference using the ESTIMATE algorithm, and computational inference using the ABSOLUTE algorithm. Tumor purity was significantly higher in the non-benefit group. Differences were assessed using the Student’s t test. Box boundaries represent the 25th–75th percentiles, center lines indicate the means, and whiskers denote the 10th and 90th percentiles. HCC, hepatocellular carcinoma; IHC, immunohistochemistry; KUSS50, Korea University Sorafenib Signature with 50 genes; TCGA, The Cancer Genome Atlas.Fig. 4
We then proceeded to identify somatic mutations that were significantly associated with the KUSS50-defined subtypes (Fig. S4). Whereas mutations of TP53 exhibited no notable correlations with the subtypes, mutations of CTNNB1 (encoding β-catenin) and AXIN1 were notably enriched in the non-benefit subtype. This observation suggests the involvement of the β-catenin pathway in conferring resistance to sorafenib, which is in alignment with prior studies demonstrating that the activation of β-catenin is linked with sorafenib resistance.[36], [37], [38] Notably, mutations of the ALB and APOB genes, which encode plasma proteins, were significantly associated with the non-benefit subtype, suggesting a connection between loss of plasma proteins and resistance to sorafenib. Conversely, mutations of BAP1 and FAM47A were predominantly associated with the benefit subtype.
Furthermore, an analysis of H&E-stained tumor sections from the TCGA cohort demonstrated that the KUSS50-defined non-benefit subtype exhibited significantly higher tumor purity compared to the benefit subtype (Fig. 4C). This observation suggests that reduced stromal cell–cancer cell interactions underlie the limited responsiveness to sorafenib. Consistent with this, genome-based tumor purity estimates derived from the ESTIMATE (transcriptomic) and ABSOLUTE (copy-number) algorithms showed similar trends.34^,^35 As expected, the three independent methods for assessing tumor purity were highly concordant (Fig. S5), further validating the observed differences between the two KUSS50-defined subtypes (Fig. 4C).
Prognostic relevance of KUSS50-defined subtypes and the association with previously identified genomic subtypes
Given the substantial correlation between the KUSS50 and favorable outcomes in patients with HCC undergoing sorafenib treatment that we observed, we next assessed the prognostic significance of the KUSS50-defined subtypes of HCC. We conducted this analysis using data from patients in the TCGA HCC cohort, most of whom did not receive sorafenib.31 As a point of reference, in the TCGA cohort, we evaluated the prognostic relevance of the HCC subtypes previously identified by the NCIP predictor,[25], [26], [27] which have been shown to be associated with prognosis. As expected, the NCIP predictor–identified subtypes were significantly associated with prognosis in the TCGA patients (Fig. 5A). In contrast, the KUSS50-defined subtypes were not associated with prognosis in the TCGA cohort. This observation strongly demonstrates that although the KUSS50 could serve as a predictive marker for identifying patients with HCC who are likely to benefit from sorafenib administration, it lacks prognostic relevance in a general context in which sorafenib is not administered.Fig. 5. Association of the KUSS50-defined subtypes with previously identified genomic subtypes of HCC and with HCC prognosis.(A) Kaplan–Meier survival analysis showing OS for patients with HCC from the TCGA-LIHC cohort (n = 365), stratified by AJCC stage, NCIP classification, and the KUSS50 predictor. Survival differences between groups were assessed using the log-rank test.(B) Comprehensive comparative analysis of the KUSS50-defined subtypes with various genomic subtypes of HCC identified and characterized in previous studies. The color map illustrates “the distribution of previously identified subtypes, including NCIP subtypes, TCGA iCluster subtypes, Hoshida subtypes, INSERM subtypes, and hepatic stem cell subtypes, among patients with the KUSS50-defined benefit and non-benefit subtypes. AJCC, American Joint Committee on Cancer; dHCC, differentiated HCC; ECM, extracellular matrix; HCC, hepatocellular carcinoma; HP, high proliferation; HS1, hepatic stem 1; HS2, hepatic stem 2; iC1, iCluster1; iC2, iCluster2; iC3, iCluster3; KUSS50, Korea University Sorafenib Signature with 50 genes; LP, low proliferation; NCIP, National Cancer Institute proliferation; ROC, receiver-operating characteristic; OS, overall survival; PP, periportal; PV, perivenous; STEM, stem cell; TCGA, The Cancer Genome Atlas.Fig. 5
Building on this finding, we further explored the potential correlation between sorafenib sensitivity as defined by the KUSS50-defined subtypes and the previously identified genomic subtypes of HCC. Remarkably, the KUSS50-defined benefit subtype showed notable, but not exclusive, association with several well-established genomic subtypes of HCC. For instance, a substantial portion of benefit cases aligned with the iCluster 1 (iC1) group in the TCGA classification (Fig. S6).31 Similar trends were observed with Hoshida’s S1 and S2 subtypes39 and with the STEM (stem cell) and ECM (extracellular matrix) subtypes defined by the INSERM classification.40 Among the four INSERM subtypes, the KUSS50-defined benefit subtype had the strongest correlation with the ECM subtype and had a weaker yet significant correlation with the STEM subtype (Fig. S7). The benefit group also showed enrichment in hepatic stem cell–like subtypes (HS1 and HS2).29 However, this association was not absolute, as KUSS50-defined benefit cases were also distributed across non-stem–like subtypes. These findings suggest that while it has meaningful overlap with existing molecular classifications, the KUSS50 reflects a distinct aspect of sorafenib sensitivity that may span multiple biological programs.
Taken together, these findings highlight a compelling and consistent correlation between the benefit of sorafenib treatment according to the KUSS50 and specific genomic subtypes of HCC. The KUSS50-defined benefit subtype aligns with certain HCC subtypes across diverse genomic classification systems (Fig. 5B). This convergence of evidence reinforces the association between the benefit of sorafenib treatment and the underlying genomic landscape of HCC, suggesting that specific genomic features contribute to the observed sensitivity or resistance of HCC to sorafenib treatment.
Molecular characteristics of the KUSS50 subtypes
To gain insight into the underlying biological processes and molecular mechanisms associated with the KUSS50, we performed a gene network analysis using the KUSS50 genes. Ingenuity Pathway Analysis revealed that the cAMP response element–binding protein and G protein–coupled receptor signaling pathways were among the activated pathways in the benefit subtype (Fig. S8). Notably, the ferroptosis signaling pathway emerged as a highly activated pathway in the benefit subtype. Ferroptosis is a distinct form of regulated cell death characterized by the iron-dependent accumulation of lipid peroxides in cells.41^,^42 The activation of this pathway in the benefit subtype suggested that the increased baseline ferroptotic activity in HCC cells increased their sensitivity to the cytotoxic effects of sorafenib. Supporting this hypothesis, we found that ACSL4, a key metabolic gene that promotes ferroptosis, was significantly upregulated in the benefit subtype (Table S2; Fig. S3). ACSL4 plays a crucial role in the metabolism of polyunsaturated fatty acids,43 which are particularly susceptible to lipid peroxidation and are necessary for ferroptosis execution.44^,^45 The upregulation of ACSL4 in the benefit subtype suggests that elevated polyunsaturated fatty acid metabolism and increased susceptibility to ferroptosis contribute to the observed sensitivity of HCC to sorafenib treatment. In addition to the ferroptosis pathway, Ingenuity Pathway Analysis identified the stathmin-1 and phagosome formation pathways as being predicted to be activated in the benefit group. Stathmin-1 is a microtubule-destabilizing protein involved in the regulation of cellular processes such as cell cycle progression and migration.46
To investigate the biology underlying the non-benefit subtype, we performed a complementary upstream regulator analysis of the genes upregulated in this group (Table S6). This analysis identified CTNNB1 (β-catenin) as one of the most significantly activated upstream regulators (Fig. S9), consistent with our earlier finding of an increased frequency of activating CTNNB1 mutations in the non-benefit subtype (Fig. S4). These data strongly support the role of β-catenin signaling in promoting resistance to sorafenib. Beyond CTNNB1, other predicted activated upstream regulators included HGF (a ligand for MET), STAT3, and ERBB2 – factors commonly linked to growth factor signaling, survival, and treatment resistance.
These analyses provide important mechanistic insights into the molecular divergence between the KUSS50-defined subtypes. The benefit subtype is characterized by ferroptosis activation and metabolic vulnerability, which may sensitize tumors to sorafenib. In contrast, the non-benefit subtype displays the enrichment of β-catenin signaling and other pro-survival pathways, which may underlie intrinsic resistance. Together, these findings highlight distinct therapeutic vulnerabilities and offer a framework for stratifying patients with HCC based on underlying tumor biology to improve treatment outcomes.
Discussion
In the present study, we conducted a comprehensive analysis of the gene expression data of HCCs in patients who underwent sorafenib treatment. Our analysis led to the identification of the robust 50-gene expression signature, termed the KUSS50, which can accurately identify which patients with HCC are likely to benefit from sorafenib use. The predictive power of the KUSS50 was consistently validated across multiple independent cohorts, highlighting its potential clinical utility in guiding treatment decisions and optimizing patient selection for sorafenib therapy for HCC.
We extensively validated the predictive accuracy of the KUSS50 across two independent cohorts from two phase III clinical trials in which patient data were prospectively collected and analyzed. Notably, the KUSS50 demonstrated high specificity in predicting sorafenib benefit, as demonstrated by several lines of evidence. First, in the IMbrave150 cohort, the Bayesian probability of the KUSS50 exhibited a strong association with clinical outcomes in patients receiving sorafenib but not in those receiving atezolizumab or atezolizumab plus bevacizumab. Second, in the TCGA cohort, in which most patients did not receive sorafenib, the KUSS50 was not associated with prognosis. These findings strongly demonstrate that the KUSS50 can serve as a predictive marker specifically for sorafenib benefit in HCC but lacks prognostic relevance in a general context or predictive power for other treatments. This distinction is crucial for guiding treatment decisions and optimizing patient care, particularly in the context of advanced HCC, for which treatment options are limited.
Our dissection of the molecular characteristics of the KUSS50-defined HCC subtypes produced several intriguing insights. Whereas we observed no significant differences in copy-number alterations between the benefit and non-benefit subtypes, the non-benefit subtype had a higher mutational burden. Notably, mutations of CTNNB1 and AXIN1, components of the Wnt pathway, were enriched in the non-benefit subtype, corroborating previous reports that implicated WNT/β-catenin activation in sorafenib resistance of HCC.[36], [37], [38] Conversely, BAP1 and FAM47A mutations were associated with the benefit subtype, warranting further investigation of their potential roles in modulating sorafenib sensitivity in HCC. Another intriguing finding emerged from the analysis of tumor purity, in which the non-benefit subtype exhibited higher cancer cell purity than the benefit subtype, demonstrating that a stromal cell–cancer cell interaction may influence sorafenib benefit. This observation aligns with the well-documented role of the tumor microenvironment in drug resistance and highlights the potential utility of incorporating tumor purity assessments into models designed to predict sorafenib benefit in HCC.47
Of note, the KUSS50-defined benefit subtype exhibited striking overlap with previously identified genomic subtypes of HCC, such as the iCluster1 subtype from the TCGA, the S1 and S2 subtypes from Hoshida's classification of HCC, and the STEM and ECM subtypes from the INSERM classification, providing further evidence of the biological relevance of KUSS50 in the context of HCC heterogeneity. The convergence of sorafenib sensitivity with specific genomic subtypes of HCC underscores the importance of integrating molecular profiling into clinical practice to tailor treatment strategies to individual patient profiles.
A pathway analysis of the genes in the KUSS50 revealed the activation of several intriguing pathways in the benefit subtype, including the cAMP response element–binding protein, G protein–coupled receptor, stathmin-1, and phagosome formation pathways. Notably, the ferroptosis signaling pathway was highly activated in the benefit subtype. Ferroptosis, a non-apoptotic form of cell death, is characterized by the iron-dependent accumulation of lipid hydroperoxides in the cell membrane to lethal levels.41^,^42 Among the lipid species in the cell membrane, polyunsaturated fatty acids are substrates that are particularly susceptible to lipid peroxidation, and the inhibition of polyunsaturated fatty acid synthesis has been shown to prevent ferroptosis.44^,^45
Notably, ACSL4, a gene that plays a crucial role in processing polyunsaturated fatty acids for membrane incorporation and is necessary for promoting ferroptosis,43 was significantly upregulated in the benefit subtype. This observation aligns with a recent study that demonstrated that the depletion of ACSL4 makes HCC cells resistant to sorafenib,48 and it further supports the hypothesis that elevated ferroptotic activity sensitizes cells to sorafenib-based treatment. Previous studies suggested that exposure to sorafenib can induce ferroptosis by inhibiting the SLC7A11/xCT transporter system,49^,^50 which is essential for intracellular cystine transport. Cystine is a precursor of glutathione, a key reducing metabolite that protects cells from ferroptosis.41^,^42 Interestingly, in a gene set-enrichment analysis, we did not detect increased reactive oxygen species levels associated with the KUSS50 (data not shown). This suggests that ferroptosis activation in the benefit subtype is driven not by heightened oxidative stress per se, but rather by increased susceptibility to lipid peroxidation, possibly due to elevated polyunsaturated fatty acid metabolism. Together, these findings support a model in which the benefit subtype is primed for ferroptosis and suggest that this vulnerability contributes to the enhanced efficacy of sorafenib. Further investigation into how the KUSS50 influences ferroptotic regulation and its therapeutic implications could offer valuable strategies for optimizing HCC treatment.
Despite showing a strong correlation of the KUSS50 with sorafenib benefit in HCC, our study had some limitations. First, the number of patients in the IMbrave150 sorafenib arm was small. This limitation primarily arises from the original genomic study's focus on identifying potential biomarkers of HCC response to the combination of atezolizumab and bevacizumab, resulting in the relatively limited availability of genomic data in the sorafenib arm.20 Further validation of the KUSS50 in a larger cohort of patients given sorafenib is essential to ensure the robustness of the KUSS50 model. Second, the present study utilized data from a genomic sub-study of the IMbrave150 trial, which included only a subset of patients with available genomic profiles. Therefore, the response rate differs from that of the original IMbrave150 trial, and the findings should be interpreted with caution. Validation using large-scale prospective genomic cohorts will be necessary to confirm these results. Third, although we explored associations of the KUSS50 with genomic subtypes of HCC, a deeper understanding of the functional consequences of these associations is necessary. Further investigations are required to elucidate the molecular pathways and processes through which the KUSS50 influences the benefit of sorafenib treatment in HCC. This knowledge could provide valuable information to identify potential therapeutic targets and strategies for improving the efficacy of sorafenib in patients with HCC. Fourth, while the KUSS50 was validated in the IMbrave150 and BIOSTORM cohorts, these cohorts may not fully capture the genetic and ethnic diversity of the global HCC population. Broader validation in more heterogeneous, geographically diverse populations will be necessary to confirm the generalizability and real-world applicability of the KUSS50. Finally, it is worth investigating the hypothesis that the KUSS50 also has predictive value for other multikinase inhibitors, such as regorafenib, which shares overlapping targets with sorafenib. Although promising, this hypothesis requires formal validation in patients treated with regorafenib or related agents to determine whether KUSS50 can serve as a broader predictive biomarker for this drug class.
In conclusion, our study highlights the clinical relevance of transcriptomic signatures in predicting sorafenib treatment outcomes in patients with HCC. The KUSS50 emerged as a promising tool for personalized medicine in HCC by predicting sorafenib benefit. The KUSS50 model is highly accurate and holds potential for optimizing treatment efficacy, minimizing side effects, and improving patient outcomes. Further research is warranted to explore the underlying biology of the KUSS50-defined subtypes of HCC, particularly the role of ferroptosis in HCC. To facilitate clinical translation, we refined the KUSS50 to a streamlined gene panel suitable for rapid and cost-effective platforms such as quantitative reverse-transcription PCR or NanoString. This enables practical application in routine clinical settings, in which gene expression profiling from pretreatment biopsies could be used to identify patients with high KUSS50 Bayesian probability scores (e.g. >0.7), who are more likely to respond favorably to sorafenib. Additionally, addressing the limitations of the KUSS50 model and investigating its applicability to other treatment options and diverse populations are crucial next steps. These findings present new avenues for improving the management of HCC through more precise molecular classification and better treatment guidance.
Abbreviations
HCC, hepatocellular carcinoma; KU, Korea University; KUSS50, Korea University Sorafenib Signature with 50 genes; NCIP, National Cancer Institute proliferation; ROC, receiver-operating characteristic; TCGA, The Cancer Genome Atlas.
Authors’ contributions
J.-S.L., J.H.K., and S.Y.Y. conceived and supervised the study. Y.-S.L., Y.K.J., Y.S.S., H.J.Y., B.S., and H.P. analyzed the data and prepared the figures. H.K., S.-H.K., E.C., Y.J.Y., S.H.K., J.E.Y., K.S.Y., S.H.L., Y.S.J., Y.T., T.H.K., Y.L., and H.L. processed and maintained data for analysis. S.Y.Y., J.H.K., and J.-S.L. wrote the manuscript and prepared the figures. All authors read and approved the final manuscript.
Data availability
The RNA sequencing data generated from the Korea University HCC cohort have been deposited in the NCBI Gene Expression Omnibus (GEO) under accession number [GSE285625](GSE285625). Gene expression and corresponding clinical data from the IMbrave150 and GO30140 clinical trial cohorts were obtained from Genentech and are available through the European Genome-Phenome Archive (EGA) under accession number EGAD00001008128. Data from the BIOSTORM cohort, derived from the biomarker substudy of the phase III STORM trial, are publicly available in the GEO database under accession number [GSE109211](GSE109211).
Financial support
This work was supported by the 10.13039/100000002NIH/10.13039/100000054NCI under award numbers R01CA237327, P50CA217674, and P30CA016672; the 10.13039/100007314Duncan Family Institute for Cancer Prevention and Risk Assessment Seed Funding Research Program at 10.13039/100007313MD Anderson (2016 cycle); institutional bridge funds from 10.13039/100007313MD Anderson (2022 cycle); an Institutional Research Grant from 10.13039/100007313MD Anderson (2021 cycle), to J.-S.L.; funds from The Research Supporting Program of 10.13039/501100016147The Korean Association for the Study of the Liver and The 10.13039/501100016148Korean Liver Foundation, to J.H.K.; and a grant from Industrial Strategic 10.13039/100006180Technology Development Program (20024893, Development of non-thermal dynamic focusing focused ultrasound therapy device), funded by the 10.13039/501100003052Ministry of Trade, Industry and Energy, Korea, to S.Y.Y.
Conflict of interest
The authors declare no conflict of interest.
Please refer to the accompanying ICMJE disclosure forms for further details.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Sung H.Ferlay J.Siegel R.L.Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries CA Cancer J Clin 7120212092493353833810.3322/caac.21660 · doi ↗ · pubmed ↗
- 2Rumgay H.Arnold M.Ferlay J.Global burden of primary liver cancer in 2020 and predictions to 2040 J Hepatol 772022159816063620884410.1016/j.jhep.2022.08.021PMC 9670241 · doi ↗ · pubmed ↗
- 3European Association for the Study of the Liver EASL clinical practice guidelines: management of hepatocellular carcinoma J Hepatol 6920181822362962828110.1016/j.jhep.2018.03.019 · doi ↗ · pubmed ↗
- 4Cho Y.Kim B.H.Park J.W.Overview of Asian clinical practice guidelines for the management of hepatocellular carcinoma: an Asian perspective comparison Clin Mol Hepatol 2920232522623690757010.3350/cmh.2023.0099 PMC 10121305 · doi ↗ · pubmed ↗
- 5Goh M.J.Sinn D.H.Kim J.M.Clinical practice guideline and real-life practice in hepatocellular carcinoma: a Korean perspective Clin Mol Hepatol 2920231972053660357510.3350/cmh.2022.0404 PMC 10121294 · doi ↗ · pubmed ↗
- 6Budai B.K.Tóth A.Borsos P.Three-dimensional CT texture analysis of anatomic liver segments can differentiate between low-grade and high-grade fibrosis BMC Med Imaging 2020201083295794910.1186/s 12880-020-00508-w PMC 7507285 · doi ↗ · pubmed ↗
- 7Siegel R.L.Giaquinto A.N.Jemal A.Cancer statistics, 2024 CA Cancer J Clin 74202412493823076610.3322/caac.21820 · doi ↗ · pubmed ↗
- 8Llovet J.M.Ricci S.Mazzaferro V.Sorafenib in advanced hepatocellular carcinoma N Engl J Med 35920083783901865051410.1056/NEJ Moa 0708857 · doi ↗ · pubmed ↗
