Machine learning and logistic regression in estimating survival in patients with high-malignant deep-seated soft tissue sarcomas: development and analysis based on a population-based retrospective cohort
Andrea THORN, Jessica A LAVERY, Thomas BAAD-HANSEN, Jonathan A FORSBERG, Michael Mørk PETERSEN, Christina Enciso HOLM

TL;DR
This study compares machine learning and logistic regression models to predict 5-year survival in patients with high-grade soft tissue sarcomas, finding that logistic regression performs better in this population-based cohort.
Contribution
The study develops and evaluates ML models for survival prediction in soft tissue sarcomas using a modern Scandinavian population-based dataset.
Findings
Logistic regression outperformed random forest in AUC, sensitivity, and specificity for 5-year survival prediction.
Trunk location, grade 3 tumors, and early chemotherapy were identified as key negative predictors of survival.
Random forest showed better performance during training but underperformed after internal validation.
Abstract
Soft tissue sarcomas are a heterogeneous group of malignant tumors with a high risk of metastasis, primarily to the lungs, making accurate survival prediction an essential part of long-term planning. No machine learning (ML) survival prediction models have been developed using a modern, population-based dataset from Scandinavia. We aimed to develop and compare ML models with logistic regression in predicting 5-year survival in soft tissue sarcoma patients and identify key predictive variables. This retrospective cohort study included patients diagnosed with deep-seated, high-grade soft tissue sarcomas of the extremities and trunk wall in Denmark from 2000 to 2016. Logistic regression was compared with 4 developed ML models, including random forest. Performance was assessed using the area under the curve (AUC), sensitivity, specificity, and calibration metrics, with a 70:30…
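The evaluation metrics named in the abstract (AUC, sensitivity, specificity) can be illustrated with a short plain-Python sketch. This is not the authors' code. The function names, the 0.5 classification threshold, the fixed random seed, and the coding of 1 as "alive at 5 years" are all assumptions made for illustration, as is reading the truncated "70:30" as a training/validation split.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of rows and split it 70% / 30% (assumed train/validation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(y_true, y_prob):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen positive case receives a
    higher predicted probability than a randomly chosen negative case,
    with ties counted as one half.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Sensitivity and specificity at a fixed threshold (0.5 is an assumption)."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up labels and predicted probabilities, for illustration only.
    y_true = [1, 1, 0, 0, 1, 0]
    y_prob = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
    print("AUC:", auc(y_true, y_prob))
    print("sensitivity, specificity:", sensitivity_specificity(y_true, y_prob))
```

Computing the same metrics on the training portion and on the held-out portion is one way to see the pattern reported in the findings, where a model looks stronger during training than it does on validation data.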
Taxonomy
Topics: Sarcoma Diagnosis and Treatment · Cutaneous Melanoma Detection and Management · Artificial Intelligence in Healthcare and Education
