# Integrating Large Language Models with Deep Learning for Breast Cancer Treatment Decision Support

**Authors:** Heeseung Park, Serin Ok, Taewoo Kang, Meeyoung Park

PMC · DOI: 10.3390/diagnostics16030394 · Diagnostics · 2026-01-26

## TL;DR

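This paper presents a clinical decision support system that uses Meta-Llama-3-8B-Instruct to extract TNM stage and tumor size from pathology reports, combines them with EMR variables, and predicts breast cancer treatment combinations. In a cohort of 5015 patients, GBM and XGBoost performed best (macro-F1 ≈ 0.88–0.89).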

## Contribution

The paper integrates LLM-based pathology report analysis with machine learning and deep learning models to predict treatment combinations in breast cancer.

## Key findings

- GBM and XGBoost achieved the highest and most stable predictive performance (macro-F1 ≈ 0.88–0.89; AUC = 0.867–0.868).
- DNN and Transformer models scored lower (macro-F1 = 0.74–0.82), especially with the full feature set, suggesting limited suitability for structured clinical data.
- The system demonstrates potential for improving treatment decision accuracy in real-world cancer care.
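
The macro-F1 scores quoted above average the per-label F1 with equal weight, so a rarely prescribed treatment counts as much as a common one, whereas micro-averaging pools all decisions first. The following is a minimal sketch of that difference in plain Python. The function names and the toy labels are illustrative assumptions and are not taken from the paper, whose exact label encoding is not described in this summary.

```python
from typing import List, Tuple

Matrix = List[List[int]]  # rows = patients, columns = binary labels


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def label_counts(y_true: Matrix, y_pred: Matrix, label: int) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives for one label."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        t, p = true_row[label], pred_row[label]
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
    return tp, fp, fn


def macro_f1(y_true: Matrix, y_pred: Matrix) -> float:
    """Unweighted mean of the per-label F1 scores."""
    n_labels = len(y_true[0])
    scores = [f1_from_counts(*label_counts(y_true, y_pred, j)) for j in range(n_labels)]
    return sum(scores) / n_labels


def micro_f1(y_true: Matrix, y_pred: Matrix) -> float:
    """F1 computed from counts pooled over all labels."""
    n_labels = len(y_true[0])
    tp = fp = fn = 0
    for j in range(n_labels):
        a, b, c = label_counts(y_true, y_pred, j)
        tp, fp, fn = tp + a, fp + b, fn + c
    return f1_from_counts(tp, fp, fn)


# Toy data (hypothetical): label 0 is common and always right,
# label 1 is rare and always wrong.
toy_true = [[1, 0], [1, 0], [1, 1], [1, 0]]
toy_pred = [[1, 0], [1, 0], [1, 0], [1, 1]]

print("macro-F1:", macro_f1(toy_true, toy_pred))
print("micro-F1:", micro_f1(toy_true, toy_pred))
```

On this toy input the macro average is pulled down by the rare label while the micro average stays high, which is why a macro-F1 near 0.88 suggests reasonable performance across labels and not only on the frequent ones.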

## Abstract

Background/Objectives: Breast cancer is one of the most common malignancies, but its heterogeneous molecular subtypes make treatment decision-making complex and patient-specific. Both the pathology reports and the electronic medical record (EMR) play a critical role for an appropriate treatment decision. This study aimed to develop an integrated clinical decision support system (CDSS) that combines a large language model (LLM)-based pathology analysis with deep learning-based treatment prediction to support standardized and reliable decision-making.

Methods: Real-world data (RWD) obtained from a cohort of 5015 patients diagnosed with breast cancer were analyzed. Meta-Llama-3-8B-Instruct automatically extracted the TNM stage and tumor size from the pathology reports, which were then integrated with EMR variables. A multi-label classification of 16 treatment combinations was performed using six models, including Decision Tree, Random Forest, GBM, XGBoost, DNN, and Transformer. Performance was evaluated using accuracy, macro/micro-averaged precision, recall, F1 score, and AUC.

Results: Using combined LLM-extracted pathology and EMR features, GBM and XGBoost achieved the highest and most stable predictive performance across all feature subset configurations (macro-F1 ≈ 0.88–0.89; AUC = 0.867–0.868). Both models demonstrated strong discrimination ability and consistent recall and precision, highlighting their robustness for multi-label classification in real-world settings. Decision Tree and Random Forest showed moderate but reliable performance (macro-F1 = 0.84–0.86; AUC = 0.849–0.821), indicating their applicability despite lower predictive capability. By contrast, the DNN and Transformer models produced comparatively lower scores (macro-F1 = 0.74–0.82; AUC = 0.780–0.757), especially when using the full feature set, suggesting limited suitability for structured clinical data without strong contextual dependencies. These findings indicate that gradient-boosting ensemble approaches are better optimized for tabular medical data and generate more clinically reliable treatment recommendations.

Conclusions: The proposed artificial intelligence-based CDSS improves accuracy and consistency in breast cancer treatment decision support by integrating automated pathology interpretation with deep learning, demonstrating its potential utility in real-world cancer care.
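
The Methods describe an LLM extracting TNM stage and tumor size from pathology reports before these values are merged with EMR variables. One way such free-text LLM answers might be normalized into tabular features is sketched below. This is an assumption-laden illustration, not the authors' pipeline: the answer format, the regular expressions, the field names, and the helper `parse_llm_answer` are all hypothetical.

```python
import re
from typing import Dict, Optional, Union

Value = Optional[Union[str, float]]

# Hypothetical patterns for a compact TNM string such as "pT2N1M0"
# and a tumor size such as "2.3 cm" or "23 mm".
TNM_PATTERN = re.compile(
    r"\b[cpy]{0,2}T(is|X|[0-4][a-d]?)\s*N(X|[0-3][a-c]?)\s*M(X|[01])\b"
)
SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)


def parse_llm_answer(text: str) -> Dict[str, Value]:
    """Turn a free-text LLM answer into structured fields; missing items become None."""
    fields: Dict[str, Value] = {
        "t_stage": None,
        "n_stage": None,
        "m_stage": None,
        "tumor_size_cm": None,
    }
    tnm = TNM_PATTERN.search(text)
    if tnm:
        fields["t_stage"] = "T" + tnm.group(1)
        fields["n_stage"] = "N" + tnm.group(2)
        fields["m_stage"] = "M" + tnm.group(3)
    size = SIZE_PATTERN.search(text)
    if size:
        value = float(size.group(1))
        # Normalize millimetres to centimetres so the feature has one unit.
        fields["tumor_size_cm"] = value / 10 if size.group(2).lower() == "mm" else value
    return fields


# Hypothetical EMR row and LLM answer, merged into one feature record.
emr_row = {"patient_id": "demo-001", "age": 52}
llm_answer = "Stage: pT2N1M0; tumor size: 23 mm"
features = {**emr_row, **parse_llm_answer(llm_answer)}
print(features)
```

Returning `None` for anything the patterns cannot find keeps extraction failures visible downstream instead of silently filling in a guess.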

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** Breast Cancer (MESH:D001943), cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12897023/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12897023/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12897023/full.md

---
Source: https://tomesphere.com/paper/PMC12897023