# A vision transformer-radiomics approach for enhanced chemotherapy outcome prediction in ovarian cancer

**Authors:** Neman Abdoli, Patrik Gilley, Ke Zhang, Youkabed Sadri, Theresa Thai, Yong Chen, Lauren Dockery, Kathleen Moore, Robert Mannel, Yuchen Qiu

PMC · DOI: 10.3389/fradi.2026.1702977 · 2026-03-17

## TL;DR

This study combines handcrafted radiomics features with Vision Transformer (ViT) and MedSAM embeddings from pre-treatment CT scans to predict how ovarian cancer patients will respond to chemotherapy, with the aim of supporting more personalized treatment.

## Contribution

The study integrates Vision Transformer and MedSAM embeddings with handcrafted radiomics features, and reports that the combined representation improves chemotherapy outcome prediction in ovarian cancer.

## Key findings

- The combined ViT and MedSAM embedding model achieved the highest AUC, 0.924 ± 0.032, for predicting 6-month progression-free survival (PFS).
- Integrating all three feature groups (radiomics, ViT, and MedSAM) gave a comparable AUC of 0.924 ± 0.037 and the highest classification accuracy, 0.831 ± 0.042.

## Abstract

Early prediction of chemotherapy response in ovarian cancer patients is essential for enabling personalized treatment strategies and improving clinical outcomes. However, this prediction remains challenging due to the high heterogeneity of tumor biology, patient-specific factors, and treatment regimens. Recent advances in imaging biomarkers derived from both radiomics and advanced deep learning methods offer promising tools for characterizing tumor phenotypes and predicting treatment outcomes.

In this retrospective study, pre-treatment CT scans from 182 ovarian cancer patients were analyzed. Three categories of imaging features were extracted: handcrafted radiomics descriptors, embeddings from a pretrained Vision Transformer (ViT), and embeddings from MedSAM, a medical foundation model adapted for segmentation. All features were standardized and subjected to least absolute shrinkage and selection operator (LASSO) regression for feature selection. Support vector machine (SVM) classifiers were trained to predict 6-month progression-free survival (PFS). Model performance was evaluated using cross-validated metrics including area under the receiver operating characteristic curve (AUC) and classification accuracy.
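The modeling steps described above (standardization, LASSO-based feature selection, an SVM classifier, and cross-validated AUC and accuracy) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, and the LASSO penalty `alpha`, the RBF kernel, and the 5-fold stratified split are placeholder choices that the abstract does not specify.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the fused feature matrix (NOT the study data):
# 182 "patients" and 60 columns playing the role of concatenated
# radiomics, ViT, and MedSAM features; y is a binary 6-month PFS label.
X, y = make_classification(
    n_samples=182, n_features=60, n_informative=10, n_redundant=0,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)

# Scaling and feature selection live inside the pipeline, so they are
# refit on the training portion of every fold and never see test data.
model = make_pipeline(
    StandardScaler(),
    SelectFromModel(Lasso(alpha=0.01, max_iter=10_000)),  # alpha is assumed
    SVC(kernel="rbf", C=1.0),                             # kernel is assumed
)

# Assumed evaluation scheme: 5-fold stratified cross-validation.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(model, X, y, cv=cv, scoring=["roc_auc", "accuracy"])

auc = scores["test_roc_auc"]
acc = scores["test_accuracy"]
print(f"AUC      {auc.mean():.3f} ± {auc.std():.3f}")
print(f"Accuracy {acc.mean():.3f} ± {acc.std():.3f}")
```

Wrapping the scaler and the LASSO selector in one pipeline is a deliberate choice in this sketch: fitting them on the full dataset before splitting would leak information from the held-out folds and inflate the reported metrics.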

The combined ViT and MedSAM embedding model achieved the highest AUC of 0.924 ± 0.032. Integration of all three feature groups (radiomics, ViT, and MedSAM) yielded a comparable AUC of 0.924 ± 0.037 and the highest classification accuracy of 0.831 ± 0.042.

These findings demonstrate that integrating complementary imaging representations enhances chemotherapy response prediction. The combination of transformer-based embeddings and radiomics features provides rich, task-specific tumor characterization from CT imaging and supports the development of precision oncology decision tools.

## Linked entities

- **Diseases:** ovarian cancer (MONDO:0005140)

## Full-text entities

- **Diseases:** ovarian cancer (MESH:D010051), tumor (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13036210/full.md

---
Source: https://tomesphere.com/paper/PMC13036210