# Leveraging Large Language Models and Machine Learning for Success Analysis in Robust Cancer Crowdfunding Predictions: Quantitative Study

**Authors:** Runa Bhaumik, Abhishikta Roy, Vineet Srivastava, Lokesh Boggavarapu, Ranganathan Chandrasekaran, Edward K Mensah, John Galvin

PMC · DOI: 10.2196/73448 · 2025-11-19

## TL;DR

This study uses advanced AI models to analyze crowdfunding campaigns for cancer patients, identifying key factors like communication and financial hardship that predict success.

## Contribution

The study introduces a novel framework combining large language models and machine learning to predict medical crowdfunding success with higher accuracy.

## Key findings

- Gradient boosting outperformed other algorithms in identifying successful campaigns with high sensitivity (0.786–0.798).
- Key predictors of success include medical severity, financial hardship, and empathetic communication.
- LLMs like GPT-4o extract nuanced linguistic and social features that improve predictive modeling.

## Abstract

Recent advances in large language models (LLMs) such as GPT-4o offer a transformative opportunity to extract nuanced linguistic, emotional, and social features from medical crowdfunding campaign texts at scale. These models enable a deeper understanding of the factors influencing campaign success far beyond what structured data alone can reveal. Given these advancements, there is a pressing need for an integrated modeling framework that leverages both LLM-derived features and machine learning algorithms to more accurately predict and explain success in medical crowdfunding.

This study addressed a gap: the failure to capture the deeper psychosocial and clinical nuances that influence campaign success. It used machine learning techniques alongside LLMs such as GPT-4o to automatically extract nuanced linguistic, social, and clinical features from campaign narratives. By combining these features with ensemble learning approaches, the proposed methodology offers a novel and more comprehensive strategy for understanding and predicting crowdfunding success in the medical domain.
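To make the feature-extraction step more concrete, here is a minimal sketch of how a narrative might be turned into a fixed set of features through an LLM. It is an illustration under assumptions, not the authors' pipeline: the feature names, the prompt wording, and the helpers `build_prompt`, `extract_features`, and `fake_llm` are all hypothetical, and the model call is passed in as a plain function so that no specific API is implied.

```python
import json
from typing import Callable, Dict

# Hypothetical feature schema: the names echo predictors mentioned in the
# abstract, but the study's actual prompt and feature list are not given here.
FEATURES = ["income_loss", "chemotherapy", "family_involvement", "empathy"]


def build_prompt(narrative: str) -> str:
    """Ask the model for a JSON object with one 0/1 flag per feature."""
    keys = ", ".join(FEATURES)
    return (
        "Read the crowdfunding narrative below and return only a JSON object "
        f"with the keys [{keys}], each set to 1 if present in the text, else 0.\n\n"
        f"Narrative: {narrative}"
    )


def extract_features(narrative: str, call_llm: Callable[[str], str]) -> Dict[str, int]:
    """Send the prompt through `call_llm` and coerce the reply into a feature dict."""
    raw = call_llm(build_prompt(narrative))
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    # Missing keys default to 0 so every campaign yields the same columns.
    return {name: int(bool(parsed.get(name, 0))) for name in FEATURES}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real GPT-4o call; returns a canned reply for demonstration."""
    return '{"income_loss": 1, "chemotherapy": 1, "empathy": 0}'


features = extract_features("After my diagnosis I had to stop working.", fake_llm)
print(features)
```

Passing the model call in as a function keeps the parsing logic testable offline; in practice `fake_llm` would be replaced by whatever client actually queries the model.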

We used GPT-4o to extract linguistic and social determinants of health features from cancer crowdfunding campaign narratives. A random forest model with permutation importance was applied to rank features based on their contribution to predicting campaign success. Four machine learning algorithms—random forest, gradient boosting, logistic regression, and elastic net—were evaluated using stratified 10-fold cross-validation, with performance measured through accuracy, sensitivity, and specificity.
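The evaluation protocol above can be sketched with scikit-learn as follows. This is a rough illustration under several assumptions: the data are synthetic (generated with `make_classification`), the hyperparameters are arbitrary defaults, "elastic net" is interpreted as elastic-net-regularized logistic regression, and the abstract does not say which library the authors used. Sensitivity is computed as TP / (TP + FN) and specificity as TN / (TN + FP).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the LLM-derived feature matrix (assumption, not study data).
X, y = make_classification(n_samples=300, n_features=12, n_informative=5, random_state=0)

# Four candidate models; all settings here are illustrative, not the paper's.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "elastic_net": make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5, max_iter=5000),
    ),
}

# Stratified folds keep the success/failure ratio similar in every split.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Each sample is predicted by a model trained on the other nine folds.
    y_pred = cross_val_predict(model, X, y, cv=cv)
    tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
    results[name] = {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of successful campaigns found
        "specificity": tn / (tn + fp),  # share of unsuccessful campaigns found
    }

for name, scores in results.items():
    print(name, {k: round(float(v), 3) for k, v in scores.items()})
```

Note that this sketch pools the out-of-fold predictions before computing the metrics; averaging per-fold scores is an equally common choice and can give slightly different numbers.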

Gradient boosting consistently outperformed the other algorithms in sensitivity (0.786 to 0.798), indicating a superior ability to identify successful crowdfunding campaigns from linguistic and social determinants of health features. Permutation importance scores revealed that severe medical conditions, income loss, chemotherapy treatment, clear and effective communication, cognitive understanding, family involvement, empathy, and social behaviors play an important role in campaign success.
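For readers unfamiliar with permutation importance, the idea is to shuffle one feature at a time and measure how much the model's score drops. The sketch below shows the mechanics with scikit-learn's `permutation_importance` on a random forest. It rests on assumptions: the data are synthetic, and the feature names are hypothetical labels attached to arbitrary columns, so the resulting ranking carries no empirical meaning.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical labels loosely echoing predictors named in the abstract;
# they are attached to synthetic columns purely for readability.
feature_names = [
    "severe_condition", "income_loss", "chemotherapy",
    "clear_communication", "family_involvement", "empathy",
]

# Synthetic data: 3 informative columns and 3 noise columns (assumption).
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Shuffle each column 10 times on held-out data and record the mean score drop.
result = permutation_importance(forest, X_test, y_test, n_repeats=10, random_state=0)

# Rank features from largest to smallest mean importance.
ranking = sorted(
    zip(feature_names, result.importances_mean), key=lambda pair: pair[1], reverse=True
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Computing the importances on held-out data rather than the training set is a design choice made here to avoid rewarding features the forest has merely memorized; the abstract does not state which split the authors used.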

This study demonstrates that LLMs such as GPT-4o can effectively extract nuanced linguistic and social features from crowdfunding narratives, offering deeper insights than traditional methods. These features, when combined with machine learning, significantly improve the identification of key predictors of campaign success, such as medical severity, financial hardship, and empathetic communication. Our findings underscore the potential of LLMs to enhance predictive modeling in health-related crowdfunding and support more targeted policy and communication strategies to reduce financial vulnerability among patients with cancer.

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Diseases:** Cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12629620/full.md

---
Source: https://tomesphere.com/paper/PMC12629620