Radiomics from Routine CT and PET/CT Imaging in Laryngeal Squamous Cell Carcinoma: A Systematic Review with Radiomics Quality Score Assessment
Amar Rajgor, Terrenjit Gill, Eric Aboagye, Aileen Mill, Stephen Rushton, Boguslaw Obara, David Winston Hamilton

TL;DR
This paper reviews how radiomics from CT and PET/CT scans can help predict outcomes in laryngeal cancer, but more standardization is needed.
Contribution
A systematic review and quality assessment of radiomics studies in laryngeal cancer, highlighting key features and methodological gaps.
Findings
Radiomic features like entropy and texture metrics show promise in predicting cancer outcomes.
Most studies used CT, with limited external validation and small sample sizes.
Methodological variability remains a challenge for clinical implementation.
Abstract
This review presents a timely and thorough synthesis of the rapidly evolving field of radiomics as applied to laryngeal squamous cell carcinoma. Radiomics holds real potential as a powerful non-invasive tool for extracting quantitative imaging biomarkers from routine CT and PET/CT scans. These biomarkers offer novel opportunities to improve tumour staging, risk stratification, prognosis, recurrence prediction, and treatment response assessment in laryngeal cancer—areas where current clinical tools are limited. Our analysis of 20 relevant studies reveals consistent radiomic features such as entropy, skewness, and texture-based metrics that demonstrate promising prognostic value across multiple clinical outcomes. We also undertake a formal assessment of methodological quality using the Radiomics Quality Score (RQS). This assessment highlights substantial variability across studies,…
Figure 3

| Author, Year, and Theme | Study Aim | Study Type and Design | Imaging Modality | Radiomics Software Used | Exclusively Laryngeal Cancer Patients | Total Patients (n) | Laryngeal Cancer Patients (n) | Primary Treatment | Model Validation Strategy | Radiomics Feature Selection and Signature Construction | Significant Radiomic Features | Model Performance/Statistical Outcomes | Limitations | Conclusions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | To determine whether CT radiomics can improve T-stage classification (T3 vs. T4) in advanced laryngeal cancer. | Retrospective single-institution cohort with internal validation using a train–test division. | Contrast-enhanced CT | Pyradiomics | Yes | 211 | n = 211 | Surgery ± adjuvant therapy | Internal validation: 70/30 train–test split (n = 150/61) | | | | Retrospective, single-centre study. | Integrating CT radiomics with radiologist assessment significantly improved T-stage classification accuracy in advanced laryngeal cancer, demonstrating strong predictive performance in both training and validation cohorts. |
| | To evaluate the potential of CT-based radiomic features in predicting thyroid cartilage invasion in laryngeal cancer. | Retrospective single-institution cohort with internal validation using cross-validation. | Contrast-enhanced CT | Radcloud and Anaconda3 (Python 3.6-based) | Yes | 236 | n = 236 | Not applicable (Diagnostic) | Internal validation: 5-fold cross-validation | | | | Retrospective, single-centre study. | CT radiomics-based models demonstrated high accuracy in predicting thyroid cartilage invasion, outperforming radiologist assessment and showing potential to aid diagnostic decision-making. |
| | To assess whether CT radiomic features (pre-biopsy) can classify T-stage and histological grade in supraglottic laryngeal tumours. | Retrospective single-institution cohort with internal validation using a train–test division. | Contrast-enhanced CT | Pyradiomics | Yes | 20 | n = 20 | Not applicable (Diagnostic) | Internal validation: 80/20 train–test split (n = 16/4) | | | | Retrospective, single-centre study. | CT radiomic features demonstrated moderate discriminatory performance for both T-stage and histological grade classification in supraglottic laryngeal cancer, suggesting potential for non-invasive diagnostic stratification. Validation is necessary on larger data sets. |
| Author | Study Aim | Study Type and Design | Imaging Modality | Radiomics Software Used | Exclusively Laryngeal Cancer Patients | Total Patients (n) | Laryngeal Cancer Patients (n) | Primary Treatment | Model Validation Strategy | Radiomics Feature Selection and Signature Construction | Significant Radiomic Features | Model Performance/Statistical Outcomes | Limitations | Conclusions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | To evaluate the association between post-CRT 18F-FDG PET radiomics and local tumour control in HNSCC, and to compare two software implementations. | Retrospective single-institution cohort with internal validation using a train–test division | 18F-FDG PET/CT | In-house developed software from MAASTRO and University Hospital Zurich | No | 178 | n = 11 | CRT | Internal validation: 70/30 train–test split (n = 128/50) | | | | Retrospective, single-centre study. | An increased histogram range and elevated GLCM difference entropy are associated with a higher risk of tumour recurrence. Both post-treatment PET-CT radiomic models demonstrated prognostic value for local tumour control and showed comparable performance. |
| | To assess the value of pre-treatment 18F-FDG PET texture analysis in predicting treatment failure in primary HNSCC treated with concurrent CRT. | Retrospective single-institution cohort with internal validation using cross-validation | 18F-FDG PET/CT | CGITA v1.3 | No | 90 | n = 14 | CRT | Internal validation: 10-fold cross-validation | | | | Retrospective, single-centre study. | LILRE from pre-treatment 18F-FDG PET/CT independently predicts local failure in HNSCC patients undergoing CRT. Incorporating texture analysis with clinical variables may improve local control prediction. |
| | To assess the prognostic value of texture features from 18F-FDG PET/CT in a large cohort of HNSCC patients across all subsites and stages. | Retrospective single-institution cohort with internal validation using cross-validation | 18F-FDG PET/CT | LIFEx | No | 284 | n = 32 | Mixed Modalities | Internal validation: 30-fold cross-validation | | | | Retrospective, single-centre study. | MTV and GLCM correlation derived from pre-treatment 18F-FDG PET/CT were independent prognostic factors for overall survival in patients with head and neck squamous cell carcinoma. |
| | To evaluate whether combining pre- and post-treatment 18F-FDG PET/CT radiomics with clinical data improves prognostic accuracy in laryngeal and hypopharyngeal cancer. | Retrospective single-institution cohort with internal validation using a train–test division | 18F-FDG PET/CT | PET Edge (MIM v7.1.7) for VOI segmentation; CGITA toolbox (via MATLAB 2012a) for radiomic feature extraction, following IBSI guidelines. | No | 91 | n = 57 | CRT | Internal validation: 70/30 train–test split (n = 61/30; includes hypopharyngeal cancer) | | | | Retrospective, single-centre study. | Combining delta radiomic features from pre- and post-treatment 18F-FDG PET/CT with clinical data significantly improved prognostic performance for both progression-free and overall survival in laryngeal and hypopharyngeal cancer. |
| | To assess whether 18F-FDG PET/CT radiomic features can predict survival and disease progression in laryngeal cancer. | Retrospective single-institution cohort with internal validation using a train–test division | 18F-FDG PET/CT | LIFEx | Yes | 49 | n = 49 (T1–2: 20, T3–4: 29) | Mixed Modalities | Internal validation: 70/30 train–test split (n = 34/15) | | | | Retrospective, single-centre study. | Machine learning models using 18F-FDG PET radiomic and clinical features demonstrated strong predictive performance for disease progression and progression-free survival in laryngeal cancer patients. |
| | To investigate the value of pre-treatment 18F-FDG PET radiomics in modelling local tumour control. | Retrospective single-institution cohort with internal validation using a train–test division | 18F-FDG PET/CT and Contrast-enhanced CT | In-house developed software (Python-based) | No | 172 | n = 10 | CRT | Internal validation: 70/30 train–test split (n = 121/51) | | | | Retrospective, single-centre study. | Tumours exhibiting more homogeneous CT density and concentrated areas of high FDG uptake were associated with better prognosis. Radiomic analyses from both CT and PET demonstrated similarly strong ability to discriminate local tumour control in HNSCC. |
| | Radiomics combined with clinical data improves prediction of recurrence risk in head and neck cancer. Specific texture features from PET/CT are associated with locoregional recurrence and distant metastases, demonstrating good predictive performance in external validation cohorts. | Retrospective multicentre cohort divided into training and external validation cohorts | 18F-FDG PET/CT and Contrast-enhanced CT | In-house developed software (MATLAB-based) | No | 300 | n = 45 | CRT | Internal and external validation: Training cohort (n = 194); external validation on independent cohort (n = 106) | | | | Retrospective study. | Radiomics offers valuable prognostic insights for locoregional recurrence and distant metastases in head and neck cancer. |
| | To evaluate the association between CT radiomics and overall survival in locally advanced HNSCC patients treated with induction chemotherapy. | Retrospective single-institution study; no internal or external validation. | Contrast-enhanced CT | TexRad | No | 72 | n = 21 | Induction Chemotherapy ± Definitive Treatment | No formal validation: Multivariate Cox regression | | | | Retrospective, single-centre study. | CT radiomic features, specifically entropy and skewness, alongside clinical factors including tumour size and nodal stage, were independently associated with overall survival in locally advanced HNSCC patients treated with induction chemotherapy. No formal validation was performed. |
| | To evaluate the prognostic value of radiomics in locally advanced HNSCC treated with CRT or BRT. | Retrospective single-institution cohort with internal validation using cross-validation. | Contrast-enhanced CT | Oncoradiomics (MATLAB-based) | No | 120 | n = 18 | CRT or bioradiotherapy | Internal validation: 10-fold cross-validation | | | | Retrospective, single-centre study. | A radiomics signature derived from CT features significantly predicted overall and progression-free survival in locally advanced HNSCC treated with CRT or BRT. Combining radiomics with p16 status improved prognostic accuracy, and patients with high radiomic scores showed greater benefit from CRT over BRT. |
| | To assess the prognostic value of radiomic texture features from pre-treatment CT in HNSCC patients treated with CRT. | Retrospective single-institution study; no internal or external validation. | Contrast-enhanced CT | In-house developed software (MATLAB-based) | No | 62 | n = 19 | CRT | No formal validation: Univariate and multivariate Cox regression | | | | Retrospective, single-centre study. | Pre-treatment CT radiomic texture features independently predicted local failure in HNSCC. Key predictors included histogram and GLRLM metrics. Findings highlight potential for non-invasive risk stratification, though no external validation was conducted. |
| | To evaluate the prognostic value of a CT-based radiomics signature and nomogram in patients with laryngeal cancer following surgical resection. | Retrospective single-institution cohort with internal validation using a train–test division. | Contrast-enhanced CT | LIFEx | Yes | 136 | n = 136 (T1–2: 83, T3–T4: 53) | Surgery ± adjuvant therapy | Internal validation: Train–test split (n = 96/40) | | | | Retrospective, single-centre study. | Integrating radiomic features into a prognostic nomogram significantly improved overall survival prediction accuracy in laryngeal cancer patients post-surgery, outperforming both clinical-only models and AJCC staging. |
| | To determine whether pre-treatment CT texture features can predict long-term local control and laryngectomy-free survival in locally advanced laryngopharyngeal carcinoma. | Retrospective single-institution cohort with internal validation using cross-validation. | Contrast-enhanced CT | Pyradiomics | No | 60 | n = 31 | CRT | Internal validation: 10-fold cross-validation | | | | Retrospective, single-centre study. | Medium-filtered CT texture features, particularly entropy, were significant independent predictors of both local control and laryngectomy-free survival in patients with locally advanced laryngopharyngeal carcinoma. |
| | To investigate whether CT radiomics of peritumoural tissue can predict overall survival, locoregional recurrence, and distant metastases in advanced HNSCC treated with CRT. | Retrospective multicentre cohort combining DESIGN and BD2Decide datasets with internal validation using a train–test division. | Contrast-enhanced CT | RadiomiX Discovery Toolbox (Oncoradiomics) | No | 444 | n = 57 | CRT | Internal validation: 100-repeat 2-fold cross-validation | | No features reached statistical significance | | Retrospective study. | Radiomic features from the peritumoral regions are not useful for the prediction of time to OS, LR, and DM. |
| | To develop and validate a CT-based radiomics signature for predicting locoregional control in HNSCC patients treated with primary CRT. | Retrospective multicentre cohort with internal cross-validation and external validation. | Contrast-enhanced CT | MIRP (Medical Imaging Radiomics Processor), Python-based | No | 318 | n = 8 | CRT | Internal and external validation: 3-fold cross-validation (n = 233); | | | | Retrospective study. | The final signature combined tumour volume with two independent radiomic features, achieving moderate discriminatory performance for predicting locoregional control in a validation cohort. |
| | To develop a radiomics nomogram for predicting pathological response and overall survival after induction chemotherapy in advanced laryngeal cancer. | Retrospective single-institution cohort with internal validation using a train–test division. | Contrast-enhanced CT | 3D-Slicer | Yes | 114 | n = 114 (T1–2: 0, T3–4: 114) | Mixed Modalities | Internal validation: 70/30 train–test split (n = 81/33) | | | | Retrospective, single-centre study. | Radiomics-enhanced nomogram moderately improved overall survival prediction in advanced laryngeal cancer, demonstrating potential for non-invasive, individualized treatment planning. |
| | To evaluate whether CT perfusion and radiomic features from pre- and post-treatment imaging can predict one-year disease-free survival in laryngeal and hypopharyngeal cancer. | Retrospective secondary analysis of a phase II trial with internal validation via cross-validation. | Contrast-enhanced CT and CT perfusion | In-house developed software | No | 44 | n = 42 (T1–2: 0, T3–4: 42) | Induction Chemotherapy ± Definitive Treatment | Internal validation: Two-loop leave-one-out cross-validation | | | | Retrospective, single-centre study. | Combined CT perfusion and radiomic features modestly improved prediction of one-year disease-free survival compared to laryngoscopic assessment. |
| | To evaluate whether a CT-based radiomics signature can predict clinical outcomes following CRT in stage III–IV HNSCC. | Retrospective single-institution cohort with internal validation using a train–test division. | Non-contrast CT | LIFEx | No | 110 | n = 11 | CRT | Internal validation: Train–test split (n = 70/40) | | | | Retrospective, single-centre study. | CT-based radiomics signature demonstrates strong predictive ability for overall survival, progression-free survival, and local control in advanced HNSCC patients treated with CRT. This non-invasive tool may enhance risk stratification and support personalized treatment planning. |
- —National Institute for Health and Care Research
- —Newcastle University
1. Introduction
Laryngeal cancer represents a complex clinical entity, particularly in its advanced stages, where curative treatment options often carry significant trade-offs in terms of function and quality of life. In recent years, radiomics has emerged as a promising avenue for individualised care. Radiomics refers to the extraction of quantitative data from medical imaging, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET). These features can describe tumour morphology, texture, and intensity patterns that may reflect underlying histological or molecular characteristics of the disease [1]. When incorporated into predictive or prognostic models, often alongside clinical, demographic or genomic variables, radiomic features can serve as imaging biomarkers to support more individualised care. Figure 1 illustrates the standard radiomics pipeline and the process by which a model is developed.
Radiomics offers the potential to act as a ‘virtual biopsy’, providing non-invasive, repeatable assessments of the whole tumour and its microenvironment [2]. It may also assist in stratifying patients by risk of recurrence, resistance to chemoradiotherapy (CRT), or early treatment failure, areas where conventional assessment tools have shown limited predictive accuracy.
There is increasing availability of open source radiomics software, consensus guidelines on image standardisation, and growing interest in artificial intelligence applications in oncology. As clinical trials and multi-institutional datasets become more accessible, it is timely to consolidate current findings, assess reproducibility, and identify translational pathways to support integration into clinical workflows.
Despite this progress, the radiomics evidence base specific to laryngeal squamous cell carcinoma remains relatively limited, with many studies aggregating multiple head and neck subsites despite known differences in tumour biology, treatment paradigms, and imaging phenotypes. A laryngeal-specific synthesis is therefore required to avoid inappropriate generalisation and to critically evaluate the strengths and limitations of the available evidence. Moreover, although radiomics has been explored across a range of imaging modalities, this review deliberately focuses on radiomic analyses derived from routine CT and PET/CT imaging, which form the backbone of contemporary laryngeal cancer management. Imaging techniques such as endoscopic or optical approaches, which rely on fundamentally different acquisition paradigms and analytical frameworks, were therefore considered outside the scope of this radiomics-focused synthesis.
Accordingly, this review aims to evaluate the quality, consistency, and clinical relevance of the existing evidence, identify key methodological and biological patterns across studies, and critically appraise the limitations that currently constrain translation of radiomics into routine clinical practice for laryngeal cancer [3].
2. Materials and Methods
2.1. Data Sources and Searches
This study was registered on PROSPERO (ID: CRD420251117983). A systematic literature search was conducted in accordance with the PRISMA guideline for reporting of systematic reviews [4], using the MEDLINE and EMBASE electronic biomedical databases via the Ovid interface. The search strategy was designed to identify original research studies that evaluated radiomic features extracted from clinical imaging in relation to laryngeal cancer outcomes. Boolean logic was used to combine disease-related keywords (“laryngeal cancer” OR “larynx” OR “larynx cancer”) with informatics-related terms (“radiomics” OR “texture analysis” OR “machine learning” OR “artificial intelligence”). The full electronic search strategy is provided in Supplementary S1.
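As a rough illustration of how the two keyword groups fit together, the sketch below assembles a Boolean expression in Python. It assumes the groups were joined with AND, the function name `build_query` is hypothetical, and the output is a generic string rather than Ovid syntax; the authoritative strategy remains the one in Supplementary S1.

```python
def build_query(disease_terms, method_terms):
    """Join each keyword group with OR, then combine the two groups with AND."""
    def or_group(terms):
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    return or_group(disease_terms) + " AND " + or_group(method_terms)

# Keyword groups as listed in the search description.
disease = ["laryngeal cancer", "larynx", "larynx cancer"]
methods = ["radiomics", "texture analysis", "machine learning",
           "artificial intelligence"]

query = build_query(disease, methods)
print(query)
```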
The search was limited to English-language articles published between 1 January 2010 and 31 January 2024. This window was chosen to reflect the timeframe during which radiomics has become increasingly adopted within oncological research, whilst allowing breadth for the inclusion of early exploratory studies. Duplicate entries were removed, and additional eligible articles were identified through backward citation-tracking of included papers and relevant systematic reviews. Ethical approval was not required for this review.
2.2. Study Selection
Studies were eligible if they:
- Included human patients diagnosed with laryngeal squamous cell carcinoma (LSCC) (or a clearly defined subsite within a broader head and neck cohort).
- Employed radiomic feature extraction from CT, PET/CT, or MRI scans.
- Investigated outcomes related to diagnosis, staging, survival, recurrence, treatment response, or prognosis.
- Used a defined and reproducible image analysis pipeline.
- Provided statistical evaluation of radiomic model performance (e.g., AUC, concordance index, and accuracy).
- Were original, peer-reviewed full-text articles published in English.
Studies were excluded if they were case reports, abstracts, editorials, reviews, or conference proceedings; focused exclusively on preclinical or animal models; lacked a clear radiomics methodology; or did not include imaging-based feature extraction.
2.3. Data Extraction and Risk of Bias Assessment
Following database searching, all records were imported into EndNote for deduplication. Two reviewers independently screened titles and abstracts for relevance. Full texts were then retrieved for all potentially eligible articles, and inclusion was confirmed based on predefined criteria. Discrepancies were resolved by consensus. The screening and selection process is illustrated in Figure 2.
A structured data extraction template was developed, incorporating key study attributes including author and year, radiomics software used, imaging modality, study aim and design, patient cohort characteristics, modelling strategy, statistical methods, significant radiomic features, and outcome metrics. Where relevant, the presence of external validation, compliance with Image Biomarker Standardisation Initiative (IBSI) recommendations [5], and model interpretability were also recorded.
2.4. Radiomics Quality Score Assessment
To support critical appraisal as part of the review design, formal methodological quality assessment was undertaken using the Radiomics Quality Score (RQS) [1]. Each included study was evaluated across predefined RQS domains encompassing imaging protocol reporting, feature robustness, validation strategy, statistical analysis, clinical utility, and transparency. Scoring was performed conservatively based on explicit reporting within each publication, with unreported criteria scored as zero; where applicable, negative scores were assigned in accordance with the RQS framework. Two reviewers independently performed the quality assessment, with discrepancies resolved by consensus. In addition to formal scoring, narrative synthesis was used where appropriate to provide contextual interpretation of study robustness, heterogeneity, translational relevance, and methodological limitations across the body of evidence.
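The conservative scoring rule described here can be sketched as follows. This is only an illustration: the item labels and point values are hypothetical rather than the full RQS checklist, the maximum of 36 points is assumed from the published framework, and flooring negative totals at 0% is a common reporting convention that this review does not explicitly state.

```python
RQS_MAX = 36  # maximum attainable RQS total (assumed from the published framework)

def rqs_total(item_scores):
    """Sum item scores; items that were not reported (None) count as zero."""
    return sum(0 if s is None else s for s in item_scores.values())

def rqs_percent(total):
    """Express a total as a percentage of the maximum, floored at 0%."""
    return max(0.0, 100.0 * total / RQS_MAX)

# Hypothetical study: item labels and points are illustrative only.
example = {
    "imaging_protocol": 1,
    "feature_reduction": 3,
    "validation": -5,      # negative score, e.g. no validation performed
    "calibration": None,   # not reported, so scored as zero
    "open_science": 1,
}

total = rqs_total(example)
print(total, rqs_percent(total))
```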
3. Results
3.1. Study Characteristics
A total of 20 eligible studies were included in this review following full-text screening. They varied in imaging modality, radiomics software, modelling objective, sample size, and validation strategy. The most common imaging modality was CT, used in 13 studies. Seven studies used 18F-FDG PET/CT, either as the sole modality or alongside CT-derived radiomics. One study incorporated CT perfusion imaging alongside conventional radiomics. Six studies focused exclusively on laryngeal cancer cohorts, only three of which examined prognostic outcomes as their primary objective.
Given the heterogeneity of study designs, populations, and imaging pipelines, the studies were synthesised according to their primary modelling objective. This approach was selected to ensure alignment with the translational focus of this review and to enable meaningful comparison across imaging techniques. The four categories used for synthesis were as follows:
- Tumour staging and histological grading.
- Survival prediction (overall and disease-specific).
- Recurrence and progression modelling.
- Treatment response prediction, including CRT failure.
Studies that addressed more than one endpoint were categorised according to their primary outcome of interest. Table 1 and Table 2 are intended as detailed reference summaries, providing consolidated methodological and clinical information from each included study to support transparency and enable comparison across radiomics pipelines, imaging modalities, and outcomes. In addition to descriptive study-level reporting, the synthesis prioritised cross-study aggregation of recurring radiomic feature classes, evaluation practices (e.g., discrimination, calibration, and validation), and methodological limitations affecting reproducibility and generalisability. The methodological quality of each study was formally assessed using the RQS and is reported in Figure 3.
3.2. Radiomics for Tumour Staging and Histopathological Grading
Wang et al. sought to classify T3- versus T4-stage tumours in a cohort of 211 patients with advanced laryngeal SCC. Using contrast-enhanced CT and Pyradiomics, they extracted a broad range of first-order, shape and wavelet-transformed features. Features were selected with the least absolute shrinkage and selection operator (LASSO), a penalised regression method that limits overfitting by shrinking the coefficients of uninformative features to zero, and a support vector machine (SVM) model was then developed. The radiomics-based classifier demonstrated excellent predictive accuracy (AUC of 0.892 in the validation cohort) [6]. Combining radiomics with radiologist interpretation improved performance further, highlighting the potential for these tools to complement clinical expertise. However, the study’s retrospective design and lack of external validation limit its generalisability.
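For orientation, the LASSO step mentioned above can be written in its standard textbook form. The notation ($y_i$, $x_i$, $\beta$, $\lambda$) is generic and not taken from Wang et al.; for a binary endpoint such as T3 versus T4, the squared-error term would typically be replaced by a logistic loss.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}
\left\{ \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{\top}\beta \right)^2
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
```

Larger values of $\lambda$ set more coefficients exactly to zero, which is why the method doubles as a feature selector when radiomic features outnumber patients.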
Guo et al. investigated the prediction of thyroid cartilage invasion, a key criterion for defining T4a disease. In a cohort of 236 patients, they developed logistic regression models using radiomic features extracted from contrast-enhanced CT, implementing class-balancing and rigorous cross-validation. Their model achieved an AUC of 0.905, significantly outperforming junior radiologist assessments [7]. This reinforces the value of quantitative imaging for detecting subtle anatomical invasion that may be missed on routine interpretation. As with other work in this space, variability in scanner types and lack of external validation remain important considerations.
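Because AUC is the headline metric in these staging studies, a minimal sketch of what it measures may help: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The function name, scores, and labels below are made up for illustration and are not data from Guo et al.

```python
def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy model outputs: 1 = cartilage invasion, 0 = no invasion.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the values reported in this section should be read.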
Rao et al. conducted an exploratory study evaluating the role of radiomics in T-stage classification and histopathological tumour grading. Their analysis, restricted to 20 supraglottic laryngeal tumours, applied a minimum redundancy–maximum relevance approach followed by SVM modelling. Although limited by sample size, the study demonstrated moderate discriminatory ability (AUCs close to 0.79 for staging and 0.69 for grade classification) [8]. These findings suggest potential for radiomics to reflect underlying tumour biology, but also underscore the need for validation in larger, well-characterised cohorts.
3.3. Radiomics for Survival Prediction and Recurrence
A total of 14 studies identified in this review addressed prognostic modelling, incorporating endpoints such as overall survival (OS), progression-free survival (PFS), disease-free survival (DFS), and locoregional recurrence (LRR). Imaging modalities included contrast-enhanced CT and 18F-FDG PET/CT, with most studies using retrospective designs and internal validation. The studies are summarised in Table 2 and discussed below. Across these studies, reported prognostic performance was frequently derived from retrospective cohorts with internal validation only, limiting the robustness and generalisability of many proposed models despite apparently strong discrimination metrics.
The earliest prognostic investigation amongst the included studies was conducted by Zhang et al., who analysed CT-based first-order features in 72 head and neck SCC patients (21 laryngeal). Texture entropy and skewness were found to be independently associated with OS, with multivariate hazard ratios (HR) of 2.10 and 3.67, respectively [9]. Whilst promising, the study lacked external validation and was limited by its heterogeneous population.
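To make the two first-order features named above concrete, the sketch below computes histogram entropy and skewness for a handful of intensity values. It is a simplified illustration that does not reproduce the software or filtering used by Zhang et al.; the bin width of 25 and the sample values are arbitrary assumptions.

```python
import math
from collections import Counter

def first_order_entropy(values, bin_width=25):
    """Shannon entropy (bits) of intensities after fixed-width binning."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())

def skewness(values):
    """Third standardised moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up Hounsfield-unit samples from a hypothetical region of interest.
roi = [30, 35, 40, 60, 65, 90, 95, 140]
print(first_order_entropy(roi), skewness(roi))
```

Higher entropy indicates a more heterogeneous intensity distribution, and positive skewness indicates a tail of high-intensity voxels.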
Ou et al. similarly developed a CT-based radiomics signature in 120 patients (18 laryngeal), incorporating 24 radiomic features with principal component analysis-based dimensionality reduction. Their model predicted both OS and PFS with 5-year AUCs of 0.78 when combined with p16 status, demonstrating added value over clinical variables alone [10]. Notably, patients with high radiomic risk scores showed significantly better outcomes with CRT compared to bioradiotherapy, suggesting a role for treatment stratification.
Bogowicz et al. evaluated radiomic features from both PET and CT scans. Their combined PET-CT model achieved a C-index of 0.77 for local control in training and 0.73 in validation. Grey-level size zone matrix (GLSZM)-derived features such as size zone entropy (CT) and small zone low grey level emphasis (PET) emerged as significant predictors [11]. The study was notable for directly comparing imaging modalities and highlighting complementary prognostic value.
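The C-index reported here generalises AUC to censored time-to-event data. A minimal sketch of Harrell's version is given below; it ignores tied event times for simplicity, and the follow-up data are invented for illustration rather than drawn from any included study.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs where the higher-risk patient fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if patient i had an observed event before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy follow-up data: months, event flag (1 = local failure), model risk score.
times = [5, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.5, 0.1]
print(concordance_index(times, events, risks))  # 0.75
```

In this toy example six of the eight comparable pairs are ordered correctly, giving 0.75, which is in the same range as the validation values quoted above.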
PET-based prognostic work was expanded by Feliciani et al. and Guezennec et al. The former used an open-source package “CGITA” and extracted features in 90 patients (14 laryngeal), identifying grey-level run-length matrix (GLRLM) low intensity long run emphasis (LILRE) as an independent predictor of local failure, with a C-index of 0.76 for the radiomics model versus 0.65 for clinical features alone [12]. Guezennec et al., in a larger cohort of 284 patients (32 laryngeal), showed that metabolic tumour volume (MTV) and grey-level co-occurrence matrix (GLCM) correlation were independently associated with OS, with multivariate HRs of 2.01 and 4.51, respectively. AUCs for MTV and GLCM correlation were 0.68 and 0.66, respectively, highlighting moderate prognostic value [13].
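GLCM correlation, one of the features highlighted above, can be illustrated on a tiny two-dimensional image. The sketch uses a single horizontal offset and made-up pixel values; dedicated radiomics software typically works in three dimensions over several directions after intensity discretisation, so this should be read as a conceptual example only.

```python
def glcm(image, levels):
    """Symmetric, normalised co-occurrence matrix for horizontally adjacent pixels."""
    counts = [[0.0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            counts[b][a] += 1  # count both directions so the matrix is symmetric
    total = sum(map(sum, counts))
    return [[c / total for c in r] for r in counts]

def glcm_correlation(p):
    """Correlation between the grey levels of neighbouring pixels."""
    n = len(p)
    # The matrix is symmetric, so row and column marginals share mean and variance.
    mean = sum(i * p[i][j] for i in range(n) for j in range(n))
    var = sum((i - mean) ** 2 * p[i][j] for i in range(n) for j in range(n))
    if var == 0:
        raise ValueError("correlation is undefined for a constant image")
    cov = sum((i - mean) * (j - mean) * p[i][j] for i in range(n) for j in range(n))
    return cov / var

smooth = [[0, 0, 1, 1], [0, 0, 1, 1]]        # neighbours mostly alike
checker = [[0, 1, 0, 1], [1, 0, 1, 0]]       # neighbours always differ
print(glcm_correlation(glcm(smooth, 2)), glcm_correlation(glcm(checker, 2)))
```

Values near 1 indicate that neighbouring voxels tend to share similar grey levels, whereas negative values indicate alternating patterns.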
A multicentre study by Vallières et al. validated PET-CT radiomics signatures for multiple endpoints including OS, LRR, and distant metastasis. GLSZM and GLRLM features were repeatedly identified across endpoints, reinforcing their prognostic relevance. However, the cohort included only 45 laryngeal cancer patients [14].
More recent studies have applied delta-radiomics and hybrid modelling approaches. Choi et al. used paired pre- and post-treatment PET/CT scans to compute delta-radiomic scores (Rad-scores) in 91 patients (57 laryngeal). Their combined clinical–radiomic model achieved a C-index of 0.958 for OS and 0.889 for PFS. High prognostic weight was attributed to SUV variance, co-occurrence metrics, and skewness-related features [15].
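One common way to express a delta-radiomic feature is the relative change between the pre- and post-treatment value. The sketch below assumes that definition; Choi et al. may have used a different formulation (for example an absolute difference), and the feature names and numbers are hypothetical.

```python
def delta_features(pre, post):
    """Relative change per feature, (post - pre) / pre, for features present in both scans."""
    deltas = {}
    for name in pre.keys() & post.keys():
        if pre[name] == 0:
            continue  # relative change is undefined when the baseline is zero
        deltas[name] = (post[name] - pre[name]) / pre[name]
    return deltas

# Hypothetical feature values before and after treatment.
pre = {"suv_variance": 4.0, "glcm_entropy": 2.0, "skewness": 0.0}
post = {"suv_variance": 1.0, "glcm_entropy": 2.5, "skewness": 0.3}
print(delta_features(pre, post))
```

Features with a zero baseline are skipped here rather than imputed, which is a design choice of this sketch and not something reported in the source study.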
Nakajo et al. focused exclusively on 49 laryngeal cancer patients, combining PET-derived GLCM entropy and grey-level zone length matrix (GLZLM) zone length non-uniformity (ZLNU) with clinical features in a random survival forest model. Their model achieved strong performance (C-indices of 0.808–0.840 in internal validation) [16]. Similarly, Kang et al. created a CT radiomics-enhanced nomogram for advanced laryngeal cancer, achieving validation AUCs of 0.735 (1-year OS) and 0.746 (3-year OS), with important features including GLSZM size zone non-uniformity and neighbouring grey tone difference matrix (NGTDM) complexity [17].
Among studies incorporating both radiomics and perfusion imaging, Woolen et al. evaluated delta radiomic features from paired pre- and post-treatment scans to predict one-year disease-free survival. While overall discrimination was modest (validation AUC = 0.69), the imaging-based model demonstrated greater prognostic discrimination than laryngoscopic assessment of treatment response alone (AUC = 0.40). Although laryngoscopy is not used in isolation for prognostication, it remains essential for direct visual assessment and clinical decision-making following definitive treatment. These findings therefore support a complementary role for quantitative imaging biomarkers alongside standard evaluation [18].
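To clarify how an AUC below 0.5 can arise, the sketch below computes the AUC as the probability that a randomly chosen patient with the event receives a higher score than a randomly chosen patient without it (the Mann-Whitney formulation). The function name and data are hypothetical and do not reproduce the analysis of Woolen et al.

```python
def auc_mann_whitney(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for patients with the event, 0 otherwise
    scores: predicted risk, higher means more likely to have the event
    Ties between a positive and a negative score count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.8, 0.6, 0.7, 0.2, 0.9, 0.4]
auc = auc_mann_whitney(labels, scores)
# Reversing the ranking mirrors the AUC around 0.5.
auc_reversed = auc_mann_whitney(labels, [-s for s in scores])
```

Under this definition an AUC below 0.5 indicates that the predictor ranks patients in the opposite direction to the observed outcome more often than not.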
Some studies also examined local control or laryngectomy-free survival. Agarwal et al. used medium-filtered CT texture features such as entropy and kurtosis to predict these outcomes, reporting entropy as an independent predictor with p < 0.001 [19]. Meneghetti et al. performed external validation of their radiomics signature in 85 patients, reporting moderate C-indices of 0.66 for local control. Despite low laryngeal numbers, this study offered one of the few multicentre validations [20].
3.4. Radiomics for Treatment Response Prediction and Failure
Bogowicz et al. developed a model using post-treatment PET/CT radiomics to predict local tumour control. Although the cohort was heterogeneous (11 laryngeal), the radiomics models based on histogram range and GLCM difference entropy achieved promising C-indices of 0.71–0.73 in validation, suggesting meaningful signal even after treatment. A second study by the same group, combining CT and PET radiomics pre-treatment, reported comparable performance (C-index of 0.73 for the combined model), again highlighting the value of GLSZM-derived features such as small zone low grey level emphasis [11].
Feliciani et al. evaluated whether pre-treatment PET radiomics could predict local failure after CRT. Their model, incorporating GLRLM LILRE, outperformed clinical models (C-index of 0.76 versus 0.65) and identified LILRE as an independent predictor in multivariate analysis. Although only 14 of the 90 patients had laryngeal cancer, the study underscored the potential of radiomics for early failure prediction in a CRT context [12].
Similarly, Choi et al. explored the use of delta-radiomic features derived from pre- and post-treatment PET/CT scans. In a mixed cohort including 57 laryngeal cancer patients, they developed Rad-scores that were significantly associated with both PFS and OS. Their combined clinical–radiomic model achieved very high C-indices (0.889 for PFS and 0.958 for OS) [15]. Although the primary endpoints were survival-based, the use of treatment-induced radiomic change aligns this study more closely with treatment response modelling.
Nakajo et al. focused exclusively on 49 laryngeal cancer patients. They developed a model predicting both disease progression and PFS. Radiomic features such as GLCM entropy and GLZLM ZLNU were consistently selected, and the resulting model achieved strong performance with AUCs and C-indices exceeding 0.80. These results provide encouraging evidence that PET-based radiomics can identify patients at higher risk of progression early in their treatment pathway [16].
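Because GLCM entropy recurs across several of the studies above, a minimal sketch of how it can be computed may be helpful. The code assumes a 2D image whose intensities have already been discretised into integer grey levels and uses a single one-pixel horizontal offset; real pipelines typically work in 3D, aggregate over several directions, and apply PET-specific discretisation, none of which is reproduced here. Function names and test images are hypothetical.

```python
import math
from collections import Counter


def glcm_probabilities(image, dx=1, dy=0):
    """Symmetric, normalised grey-level co-occurrence matrix.

    image: 2D list of integer grey levels (already discretised).
    dx, dy: pixel offset defining which neighbour is paired with each pixel.
    Returns a dict mapping (level_i, level_j) -> joint probability.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                counts[(j, i)] += 1  # count both orders so the matrix is symmetric
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_entropy(probs):
    """Joint entropy (in bits) of the co-occurrence distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


# Hypothetical 2x2 patches with increasing heterogeneity.
uniform = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]
mixed = [[0, 1], [2, 3]]
e_uniform = glcm_entropy(glcm_probabilities(uniform))
e_checker = glcm_entropy(glcm_probabilities(checker))
e_mixed = glcm_entropy(glcm_probabilities(mixed))
```

Entropy rises as the number of distinct neighbouring grey-level pairs grows, which is the sense in which the feature is interpreted as a heterogeneity measure.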
3.5. Methodological Quality Assessment (Radiomics Quality Score)
Methodological quality of the included studies was formally assessed using RQS (version 1.0), with results summarised in Figure 3. Total RQS scores ranged from 0% to 64% of the maximum possible score, indicating substantial variability in methodological rigour across the literature. The majority of studies achieved low to moderate scores, indicating that methodological limitations remain common and that many radiomics models are not yet ready for clinical translation.
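As a worked illustration of how a total RQS is expressed as a percentage, the sketch below assumes the version 1.0 maximum of 36 points. Reporting negative totals as 0% is a common convention and is an assumption here, as is the example total of 23 points; the review reports percentages only, not the underlying point totals.

```python
RQS_MAX_POINTS = 36  # maximum total score in RQS version 1.0


def rqs_percentage(total_points):
    """Convert a total RQS score into a percentage of the maximum.

    Negative totals (possible because some items carry penalties) are
    reported as 0%, which is an assumed convention in this sketch.
    """
    clipped = max(total_points, 0)
    return 100.0 * clipped / RQS_MAX_POINTS


# Hypothetical totals: a study scoring 23 points and one with a net penalty.
high = rqs_percentage(23)
low = rqs_percentage(-3)
```

Under these assumptions, 23 of 36 points rounds to 64%, and a study whose penalties outweigh its credits is reported at 0%.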
Assessment across individual RQS domains revealed consistent patterns. Most studies reported imaging acquisition parameters and applied feature reduction or multiple testing strategies. In contrast, key methodological limitations were frequently observed, including absence of test–retest or phantom analyses, limited use of independent external validation cohorts, and a lack of formal clinical utility or decision-impact analyses. Higher RQS scores were typically achieved by studies incorporating external validation and more comprehensive statistical reporting, whereas studies without validation or robustness assessment generally demonstrated lower overall scores.
Figure 3. Methodological quality assessment of included studies [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25] using the Radiomics Quality Score. Individual RQS components are displayed for each study, with colour coding indicating criteria met (green), criteria partially met or not fully reported (orange), and criteria not met or subject to negative scoring where methodological penalties were applied (red); grey denotes criteria that were not applicable. Numbers in parentheses indicate the maximum possible score for each RQS component. Total RQS scores are presented as a percentage of the maximum possible score, illustrating variability in methodological quality across the literature.
4. Discussion
4.1. Synthesis of Key Findings
This review synthesised 20 radiomic studies focusing on laryngeal cancer or mixed cohorts containing laryngeal subsites, grouped according to modelling objective. Across staging, survival prediction, recurrence modelling, and treatment response, radiomics has consistently shown potential to extract prognostically relevant imaging features from standard-of-care CT and PET/CT scans.
Several radiomic features were repeatedly identified as predictive across studies. Entropy and skewness, both first-order features reflecting intensity distribution and heterogeneity, were among the most frequently selected [6,7,8,11,15,16,26]. Second-order features derived from GLCM and GLSZM matrices also emerged consistently, irrespective of imaging modality or disease endpoint [6,7,8,10,11,13,14,16,17,24,25]. This cross-study recurrence suggests a potential underlying biological relevance, possibly reflecting tumour complexity, aggressiveness, and response to therapy. Higher entropy and skewness are likely to reflect increased intratumoural heterogeneity, necrosis, and hypoxic burden, biological features that have been consistently associated with aggressive tumour behaviour, treatment resistance, and poorer prognosis in laryngeal squamous cell carcinoma.
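The two first-order features highlighted above can be sketched in a few lines. The code assumes the voxel intensities inside the tumour volume have already been extracted and, for entropy, discretised into integer bins; the function names and toy values are hypothetical, and individual studies may have used different discretisation or software defaults.

```python
import math
from collections import Counter


def first_order_entropy(grey_levels):
    """Shannon entropy (in bits) of the discretised intensity histogram."""
    n = len(grey_levels)
    counts = Counter(grey_levels)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def skewness(values):
    """Third standardised moment of the intensity distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # a constant region has no asymmetry
    return m3 / m2 ** 1.5


# Hypothetical discretised intensities from three small regions.
homogeneous = [3, 3, 3, 3]
heterogeneous = [1, 2, 3, 4]
right_tailed = [1, 1, 1, 5]

h_low = first_order_entropy(homogeneous)
h_high = first_order_entropy(heterogeneous)
s_sym = skewness(heterogeneous)
s_pos = skewness(right_tailed)
```

Entropy increases as intensities spread over more histogram bins, while positive skewness indicates a small number of voxels with much higher intensity than the rest.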
Clinically, the consistent selection of certain features across modalities and endpoints supports their potential as imaging biomarkers. Risk stratification using radiomics signatures could assist with decisions around treatment intensification, organ preservation, or surveillance intensity, provided these tools are interpretable, externally validated, and embedded in clinical workflows.
4.2. Strengths and Limitations of the Current Evidence
Formal assessment using the RQS provides an objective framework for interpreting the strengths and limitations of the current evidence base. As demonstrated in Figure 3, methodological quality varied widely across studies, with most achieving low to moderate RQS scores. Higher scores were observed in a small number of studies, typically characterised by more comprehensive reporting of imaging protocols, robust feature reduction strategies, and incorporation of independent external validation cohorts.
Several strengths are evident in the evolving literature. Methodologically, the more recent studies included in this review demonstrate increasing rigour. The use of machine learning methods, such as random forests, SVMs, and LASSO, reflects a growing sophistication in modelling. Importantly, some studies now incorporate delta-radiomics [15,18], tracking feature evolution across treatment timepoints, which is a promising direction for predicting response and adapting therapy [27].
Additionally, performance evaluation has matured. Some studies reported not only AUCs or C-indices but also calibration [11], risk stratification thresholds [15,21,24], and, in a few cases, external validation [14,20]. Several adopted consensus segmentation approaches, and a handful followed IBSI guidelines [5].
Despite this progress, significant limitations remain. Cohort size is a persistent issue. Only 6/20 studies focused exclusively on laryngeal cancer [6,7,8,16,17,22], and many involved fewer than 50 laryngeal patients [8,10,11,12,14,16,18,20,21,24,25,26]. Small sample sizes limit statistical power and increase the risk of overfitting, particularly in high-dimensional radiomic feature spaces [28]. Furthermore, the use of mixed head and neck cancer cohorts without laryngeal-specific subgroup analysis introduces biological and clinical heterogeneity, as tumour behaviour, treatment decisions, and prognostic implications vary significantly between subsites [29]. Importantly, these associations likely reflect fundamental differences in tumour biology across head and neck subsites. Laryngeal squamous cell carcinoma is predominantly smoking-related, whereas oropharyngeal cancers are frequently virally driven (HPV-associated) and nasopharyngeal cancers are commonly linked to Epstein–Barr virus infection. These distinct carcinogenic pathways are associated with differences in tumour microenvironment, cellular heterogeneity, angiogenesis, immune infiltration, and growth patterns, which are plausibly captured by radiomic texture and intensity features. These differences in tumour microenvironment and growth patterns may underlie observed associations between radiomic heterogeneity metrics and adverse clinical outcomes. As such, radiomic signatures derived from laryngeal tumours should not be assumed to generalise across virally driven head and neck cancers, reinforcing the need for subsite-specific model development and validation [29].
There is also notable variability in image acquisition and segmentation. Differences in scanner type, reconstruction parameters, and contrast use can affect reproducibility. While some studies addressed this with resampling or harmonisation strategies, few applied techniques such as ComBat [30] or performed phantom calibration [31]. Manual segmentation, often performed by a single observer, was the norm, with limited assessment of inter-observer variability or reproducibility.
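One simple way to quantify inter-observer agreement in segmentation is the Dice similarity coefficient between two observers' contours. The sketch below is illustrative: it represents each segmentation as a set of voxel indices, and the function name and toy masks are hypothetical rather than drawn from any reviewed study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two segmentations.

    mask_a, mask_b: iterables of voxel indices (e.g. (row, col) tuples)
    labelled as tumour by each observer.
    Returns 2 * |A ∩ B| / (|A| + |B|), with 1.0 for two empty masks.
    """
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical contours from two observers on a small 2D slice.
observer_1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
observer_2 = {(0, 1), (1, 1), (0, 2), (1, 2)}
dice = dice_coefficient(observer_1, observer_2)
```

Overlap alone does not establish feature robustness, so such a measure would typically be paired with a per-feature reproducibility analysis across the repeated segmentations.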
Most studies used internal validation only, such as train–test split or k-fold cross-validation, and external validation was rare [14,20]. None of the studies included prospective validation. This constrains generalisability and risks inflating reported model performance [32]. Furthermore, several studies failed to report calibration metrics or decision curve analyses, limiting the assessment of clinical utility.
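For reference, k-fold cross-validation partitions a single cohort into k disjoint test folds, as in the minimal sketch below. The function name, fold count, and seed are hypothetical choices for illustration and do not describe any specific study's procedure.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation.

    Patients are shuffled once with a fixed seed, then dealt into k folds;
    each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical cohort of 10 patients split into 5 folds.
splits = list(k_fold_indices(10, 5))
```

Feature selection and model tuning need to be repeated inside each training fold; if they are performed once on the full cohort, information from the test folds leaks into the model and the estimated performance becomes optimistic. Even when done correctly, all folds come from the same centres and scanners, which is why this approach cannot substitute for external validation.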
4.3. Limitations of This Review
This review employed a structured narrative synthesis, which, while appropriate for addressing methodological and clinical heterogeneity, precluded formal meta-analysis. Although a formal methodological quality assessment was undertaken using the Radiomics Quality Score, this framework does not capture all potential sources of bias, and findings should therefore be interpreted in conjunction with the qualitative appraisal presented. Only English-language publications were included, which may introduce selection bias.
4.4. Clinical Applicability and Generalisability
Radiomics holds considerable promise as a non-invasive biomarker in laryngeal cancer, with potential applications in prognosis, treatment selection, and response monitoring. However, as highlighted by the overall RQS assessment, limited external validation, reproducibility testing, and formal clinical utility analyses currently constrain translation into routine clinical practice. Standardisation across the radiomics pipeline remains a foundational requirement. Differences in imaging protocols, segmentation methods, feature extraction, and modelling approaches contribute to inconsistent results and hinder reproducibility. Adoption of consensus frameworks such as those provided by the IBSI [5], along with compliance with reporting guidelines like TRIPOD [33] and CLAIM [34], is essential to improve transparency and reproducibility across studies.
Larger, harmonised, multicentre datasets are needed to evaluate model generalisability and support regulatory progression. Initiatives such as the AIRSPACE project, a mixed retrospective and prospective cohort based across centres in the northern UK, may provide an opportunity to evaluate radiomics models within more standardised imaging pipelines and to support independent validation when fully established. Importantly, such efforts should be viewed as complementary to, rather than a substitute for, rigorous external validation across diverse healthcare settings [3].
5. Conclusions
Radiomics has shown promising utility as a non-invasive biomarker in laryngeal cancer, with applications spanning tumour staging, prognostic modelling, and prediction of treatment response. However, widespread clinical adoption remains limited by methodological inconsistencies, small cohort sizes, and lack of standardisation. Future efforts must prioritise robust validation, integration with clinical and molecular data, and alignment with real-world workflows.
References
- 1. Lambin P., Leijenaar R.T.H., Deist T.M., Peerlings J., De Jong E.E.C., Van Timmeren J., Sanduleanu S., Larue R.T.H.M., Even A.J.G., Jochems A. Radiomics: The Bridge between Medical Imaging and Personalized Medicine. Nat. Rev. Clin. Oncol. 2017, 14, 749–762. doi:10.1038/nrclinonc.2017.141. PMID: 28975929.
- 2. Limkin E.J., Sun R., Dercle L., Zacharaki E.I., Robert C., Reuzé S., Schernberg A., Paragios N., Deutsch E., Ferté C. Promises and Challenges for the Implementation of Computational Medical Imaging (Radiomics) in Oncology. Ann. Oncol. 2017, 28, 1191–1206. doi:10.1093/annonc/mdx034. PMID: 28168275.
- 3. Rajgor A. Radiomics-Risk Model for Larynx Cancer (AIRSPACE). Available online: https://research.ncl.ac.uk/airspace/ (accessed on 10 October 2025).
- 4. Page M.J., McKenzie J.E., Bossuyt P.M., Boutron I., Hoffmann T.C., Mulrow C.D., Shamseer L., Tetzlaff J.M., Akl E.A., Brennan S.E. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. BMJ 2021, 372, n71. doi:10.1136/bmj.n71. PMID: 33782057. PMCID: PMC8005924.
- 5. Zwanenburg A., Vallières M., Abdalah M.A., Aerts H.J.W.L., Andrearczyk V., Apte A., Ashrafinia S., Bakas S., Beukinga R.J., Boellaard R. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-Based Phenotyping. Radiology 2020, 295, 328–338. doi:10.1148/radiol.2020191145. PMID: 32154773. PMCID: PMC7193906.
- 6. Wang F., Zhang B., Wu X., Liu L., Fang J., Chen Q., Li M., Chen Z., Li Y., Dong D. Radiomic Nomogram Improves Preoperative T Category Accuracy in Locally Advanced Laryngeal Carcinoma. Front. Oncol. 2019, 9, 1064. doi:10.3389/fonc.2019.01064. PMID: 31681598. PMCID: PMC6803547.
- 7. Guo R., Guo J., Zhang L., Qu X., Dai S., Peng R., Chong V.F.H., Xian J. CT-Based Radiomics Features in the Prediction of Thyroid Cartilage Invasion from Laryngeal and Hypopharyngeal Squamous Cell Carcinoma. Cancer Imaging 2020, 20, 81. doi:10.1186/s40644-020-00359-2. PMID: 33176885. PMCID: PMC7661189.
- 8. Rao D., Koteshwara P., Singh R., Jagannatha V. Exploring Radiomics for Classification of Supraglottic Tumors: A Pilot Study in a Tertiary Care Center. Indian J. Otolaryngol. Head Neck Surg. 2023, 75, 433–439. doi:10.1007/s12070-022-03239-2. PMID: 37275092. PMCID: PMC10235219.
