# Segmentation of patients with small cell lung cancer into responders and non-responders using the optimal cross-validation technique

**Authors:** Elham Majd, Li Xing, Xuekui Zhang

PMC · DOI: 10.1186/s12874-024-02185-7 · 2024-04-08

## TL;DR

This paper introduces a new machine learning method to better classify small cell lung cancer patients into responders and non-responders, improving treatment decisions.

## Contribution

A novel data-driven cutoff selection method using optimal cross-validation for patient segmentation in cancer treatment.

## Key findings

- The novel method produced significantly different survival outcomes between responders and non-responders (Cox model p-value 0.009, hazard ratio 0.668).
- The standard cutoff of 0.5 failed to show a significant survival difference (Cox model p-value 0.194, hazard ratio 0.823).
- In the test data, the data-driven cutoff separated patients by survival better than the standard 0.5 cutoff.

## Abstract

The timing of treatment is an essential factor in its efficacy for cancer patients. Patients who will not respond to the current therapy should therefore receive a different treatment as early as possible. Machine learning models can be built to classify responders and non-responders. Such classification models predict the probability that a patient is a responder. Most methods use a probability threshold of 0.5 to convert these probabilities into binary group membership. However, a cutoff of 0.5 is not always the optimal choice.
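To make the role of the threshold concrete, here is a minimal sketch of turning predicted probabilities into group labels. The function name `segment` and the probabilities are made up for illustration and do not come from the paper.

```python
from typing import List


def segment(probabilities: List[float], cutoff: float = 0.5) -> List[str]:
    """Label a patient as a responder when the predicted probability reaches the cutoff."""
    return ["responder" if p >= cutoff else "non-responder" for p in probabilities]


# Illustrative (made-up) predicted probabilities of being a responder.
probs = [0.35, 0.48, 0.52, 0.71]

default_labels = segment(probs)              # conventional cutoff of 0.5
shifted_labels = segment(probs, cutoff=0.4)  # a different, hypothetical cutoff
```

With these values, the patient at 0.48 changes group when the cutoff moves from 0.5 to 0.4, which is why the choice of cutoff can change downstream treatment decisions.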

In this study, we propose a novel data-driven approach that selects a better cutoff value using the optimal cross-validation technique. To illustrate the method, we applied it to three clinical trial datasets of small-cell lung cancer patients. We used two datasets to build a scoring system for segmenting patients, then applied the resulting models to segment patients in the test data.
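This summary does not spell out the authors' procedure, so the following is only a generic stand-in for data-driven cutoff selection, not the paper's method. It assumes that out-of-fold predicted probabilities from cross-validation on the training data are already available, and it picks the candidate cutoff that maximizes Youden's J (sensitivity + specificity - 1). The criterion, the function names, and the toy data are all assumptions.

```python
from typing import Sequence


def youden_j(probs: Sequence[float], labels: Sequence[int], cutoff: float) -> float:
    """Youden's J = sensitivity + specificity - 1 at a given cutoff."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 1)
    fn = sum(1 for p, y in zip(probs, labels) if p < cutoff and y == 1)
    tn = sum(1 for p, y in zip(probs, labels) if p < cutoff and y == 0)
    fp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity - 1.0


def select_cutoff(
    oof_probs: Sequence[float], labels: Sequence[int], candidates: Sequence[float]
) -> float:
    """Return the candidate cutoff with the highest Youden's J.

    `oof_probs` are assumed to be out-of-fold predictions, so each patient's
    probability came from a model that never saw that patient during fitting.
    Ties are resolved in favor of the earliest candidate.
    """
    return max(candidates, key=lambda c: youden_j(oof_probs, labels, c))


# Toy data (made up): 1 = observed responder, 0 = observed non-responder.
oof_probs = [0.2, 0.3, 0.35, 0.45, 0.6, 0.8]
labels = [0, 0, 1, 1, 1, 1]

best_cutoff = select_cutoff(oof_probs, labels, candidates=[0.3, 0.35, 0.4, 0.5])
```

Selecting the cutoff on out-of-fold predictions, rather than on in-sample fitted probabilities, keeps the choice from being tuned to the model's optimism on its own training data.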

We found that, in the test data, the predicted responders and non-responders had significantly different long-term survival outcomes. Our proposed method segments patients better than the standard approach using a cutoff of 0.5. Comparing clinical outcomes of responders versus non-responders, our method had a p-value of 0.009 with a hazard ratio of 0.668 using the Cox proportional hazards model and a p-value of 0.011 using the accelerated failure time model, confirming a significant difference between responders and non-responders. In contrast, the standard approach had a p-value of 0.194 with a hazard ratio of 0.823 using the Cox proportional hazards model and a p-value of 0.240 using the accelerated failure time model, indicating that the responders and non-responders it identifies do not differ significantly in survival.
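As a reading aid for the hazard ratios above, the block below writes out the standard Cox model with a single group indicator. The coding $g = 1$ for predicted responders and $g = 0$ for predicted non-responders is an assumption about the direction of the comparison; the coefficients are simply the natural logarithms of the reported hazard ratios.

```latex
% Cox proportional hazards model with one binary covariate g (assumed coding:
% g = 1 predicted responder, g = 0 predicted non-responder).
h(t \mid g) = h_0(t)\, e^{\beta g},
\qquad
\mathrm{HR} = \frac{h(t \mid g = 1)}{h(t \mid g = 0)} = e^{\beta}

% Data-driven cutoff: reported HR = 0.668, p = 0.009
\beta = \ln 0.668 \approx -0.40,
\qquad 1 - 0.668 = 0.332 \ \text{(about 33\% lower hazard)}

% Standard cutoff of 0.5: reported HR = 0.823, p = 0.194
\beta = \ln 0.823 \approx -0.19,
\qquad 1 - 0.823 = 0.177 \ \text{(about 18\% lower hazard, not significant)}
```

Under that assumed coding, both cutoffs point in the same direction, but only the data-driven cutoff yields a separation between groups that is statistically significant.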

In summary, our novel prediction method can successfully segment new patients into responders and non-responders. Clinicians can use our prediction to decide if a patient should receive a different treatment or stay with the current treatment.

The online version contains supplementary material available at 10.1186/s12874-024-02185-7.

## Linked entities

- **Diseases:** small cell lung cancer (MONDO:0008433)

## Full-text entities

- **Diseases:** cancer (MESH:D009369), small cell lung cancer (MESH:D055752)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11000309/full.md

---
Source: https://tomesphere.com/paper/PMC11000309