# Developing a Predictive Model for Significant Prostate Cancer Detection in Prostatic Biopsies from Seven Clinical Variables: Is Machine Learning Superior to Logistic Regression?

**Authors:** Juan Morote, Berta Miró, Patricia Hernando, Nahuel Paesano, Natàlia Picola, Jesús Muñoz-Rodriguez, Xavier Ruiz-Plazas, Marta V. Muñoz-Rivero, Ana Celma, Gemma García-de Manuel, Pol Servian, José M. Abascal, Enrique Trilla, Olga Méndez

PMC · DOI: 10.3390/cancers17071101 · Cancers · 2025-03-25

## TL;DR

In a cohort of 5005 men, a feedforward neural network and a logistic regression model built on the same seven clinical variables predicted significant prostate cancer similarly well (test AUC 0.85 vs. 0.84, p > 0.05).

## Contribution

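Shows that machine learning and logistic regression discriminate sPCa comparably when given the seven BCN-MRI variables, with one trade-off: the machine learning model has higher recall, while the logistic regression model has higher precision and specificity.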

## Key findings

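- Both models discriminated significant prostate cancer well: AUC 0.88 (training) and 0.85 (test) for the neural network (GMV) versus 0.85 and 0.84 for logistic regression (BCN), with no significant difference (p > 0.05).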
- Machine learning showed higher recall, while logistic regression had higher precision and specificity.
- Both models reduced unnecessary biopsies by about 27–29% at 95% sensitivity, with no significant difference between them (p > 0.05).
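
The recall, precision, and specificity trade-off named in these findings can be made concrete with a small sketch. The function name `confusion_metrics` and all counts below are hypothetical illustrations, not values from the paper; they only show how one model can win on recall while another wins on precision and specificity.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return recall, precision and specificity from confusion-matrix counts."""
    recall = tp / (tp + fn)        # sensitivity: share of sPCa cases that are flagged
    precision = tp / (tp + fp)     # share of flagged men who truly have sPCa
    specificity = tn / (tn + fp)   # share of men without sPCa who are not flagged
    return {"recall": recall, "precision": precision, "specificity": specificity}


# Hypothetical counts for two models on the same 200 men (100 with sPCa).
# These numbers are invented for illustration and are NOT from the study.
high_recall_model = confusion_metrics(tp=90, fp=60, tn=40, fn=10)
high_precision_model = confusion_metrics(tp=80, fp=30, tn=70, fn=20)

for name, m in [("high-recall", high_recall_model),
                ("high-precision", high_precision_model)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

With these invented counts, the first model misses fewer cancers, while the second flags fewer men who do not have sPCa, which is the kind of trade-off the findings describe.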

## Abstract

Prostate cancer (PCa) detection remains a critical area of research, with an ongoing need for predictive tools that accurately identify significant PCa (sPCa) while decreasing unnecessary prostate biopsies and the overdetection of insignificant tumors. Risk calculators based on predictive models are among the most valuable tools, as they can individualize the likelihood of sPCa with high accuracy at no cost. Machine learning algorithms are now the preferred method for developing predictive models, especially when managing big data. However, it remains unclear whether machine learning is superior to traditional logistic regression. In this study, both approaches proved similarly effective on a limited dataset.

Objective: This study compares machine learning (ML) and logistic regression (LR) algorithms in developing a predictive model for sPCa using the seven predictive variables from the Barcelona (BCN-MRI) predictive model. Method: A cohort of 5005 men suspected of having PCa who underwent MRI and targeted and/or systematic biopsies was used for training, testing, and validation. A feedforward neural network (FNN)-based SimpleNet model (GMV) and a logistic regression-based model (BCN) were developed. The models were evaluated for discrimination ability, precision–recall, net benefit, and clinical utility. Results: Both models demonstrated strong predictive performance. The GMV model achieved an area under the curve of 0.88 in the training cohort and 0.85 in the test cohort (95% CI: 0.83–0.90), while the BCN model reached 0.85 and 0.84 (95% CI: 0.82–0.87), respectively (p > 0.05). The GMV model exhibited higher recall, making it more suitable for clinical scenarios prioritizing sensitivity, whereas the BCN model demonstrated higher precision and specificity, optimizing the reduction of unnecessary biopsies. Both models provided similar clinical benefit over biopsying all men, avoiding 27.5–29% and 27–27.5% of prostate biopsies, respectively, at 95% sensitivity (p > 0.05). Conclusions: Our findings suggest that both ML and LR models offer high accuracy in sPCa detection, with ML exhibiting superior recall and LR optimizing specificity. These results highlight the need for model selection based on clinical priorities.
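
The "biopsies avoided at 95% sensitivity" figure can be illustrated with a minimal sketch. It assumes that each man receives a predicted risk score, that a biopsy is avoided when the score falls below a threshold, and that the threshold is the highest one still flagging at least 95% of sPCa cases. The abstract does not state the authors' exact procedure, so the function names `threshold_at_sensitivity` and `biopsies_avoided` and the toy scores are hypothetical.

```python
import math


def threshold_at_sensitivity(scores, labels, target=0.95):
    """Highest threshold t such that at least `target` of positives have score >= t."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("at least one positive case is required")
    # Number of positives that must be flagged; the epsilon guards against
    # floating-point error in target * len(positives).
    k = math.ceil(target * len(positives) - 1e-9)
    return positives[k - 1]


def biopsies_avoided(scores, labels, threshold):
    """Return (fraction of all men below threshold, sensitivity at that threshold)."""
    below = sum(1 for s in scores if s < threshold)
    n_pos = sum(labels)
    flagged_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    return below / len(scores), flagged_pos / n_pos


# Toy data (invented, NOT from the study): 20 men with sPCa, 20 without.
pos_scores = [0.05] + [0.30 + 0.03 * i for i in range(19)]
neg_scores = [0.01 + 0.02 * i for i in range(20)]
scores = pos_scores + neg_scores
labels = [1] * len(pos_scores) + [0] * len(neg_scores)

thr = threshold_at_sensitivity(scores, labels, target=0.95)
avoided, sensitivity = biopsies_avoided(scores, labels, thr)
print(f"threshold={thr:.2f} avoided={avoided:.1%} sensitivity={sensitivity:.1%}")
```

Fixing sensitivity first and then reading off the share of men below the threshold makes the two models directly comparable on the same clinical constraint, which is how the abstract reports its 27–29% figures.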

## Linked entities

- **Diseases:** prostate cancer (MONDO:0005159)

## Full-text entities

- **Diseases:** Prostate Cancer (MESH:D011471)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11987821/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11987821/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/PMC11987821/full.md

---
Source: https://tomesphere.com/paper/PMC11987821