# Machine learning enhanced acute heart failure phenotype prediction using natural language processing and random forest

**Authors:** Pei-Hsuan Chang, Feng-Ching Liao, Yi-Ching Wu, Fang-Ju Sun, Yen-Yu Liu, Hung-I Yeh, Chung-Lieh Hung, Kun-Pin Wu

PMC · DOI: 10.3389/frai.2025.1664627 · Frontiers in Artificial Intelligence · 2025-10-16

## TL;DR

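This paper combines natural language processing of clinical text with laboratory data in random forest models to classify acute heart failure phenotypes (HFrEF, HFmrEF, and HFpEF) at admission, without using LVEF, with the aim of supporting earlier phenotype-specific treatment.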

## Contribution

The study integrates NLP-derived features from unstructured clinical text with structured laboratory data in random forest models to predict acute heart failure phenotypes while blinded to LVEF.

## Key findings

- The model combining textual and laboratory data performed best on the training data (accuracy 0.70 ± 0.03, AUROC 0.76 ± 0.02), ahead of either data type alone.
- Performance was optimal with up to 100 selected combined features and was not substantially attenuated at 20 features; it declined only when 10 features were selected.
- The result was replicated in an independent validation dataset.

## Abstract

Heart failure (HF), with its distinct phenotypes, poses significant public health challenges. Early diagnosis of specific HF phenotypes is crucial for timely therapeutic intervention.

We employed random forests to predict acute HF (AHF) phenotypes (HFrEF, HFmrEF, and HFpEF) during admission, using structured and unstructured data types while blinded to left ventricular ejection fraction (LVEF) information.
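As an illustration of how LVEF can serve purely as the source of training labels while being withheld from the predictors, the sketch below maps an LVEF percentage to one of the three phenotypes. The cutoffs (at most 40% for HFrEF, above 40% and below 50% for HFmrEF, at least 50% for HFpEF) are commonly used guideline-style thresholds and are an assumption here; this summary does not state which cutoffs the authors applied. The function name `phenotype_from_lvef` is hypothetical.

```python
def phenotype_from_lvef(lvef_percent: float) -> str:
    """Map an LVEF percentage to a phenotype label.

    Cutoffs are assumed guideline-style values, not confirmed by the paper.
    """
    if not 0 <= lvef_percent <= 100:
        raise ValueError("LVEF must be a percentage between 0 and 100")
    if lvef_percent <= 40:
        return "HFrEF"   # reduced ejection fraction
    if lvef_percent < 50:
        return "HFmrEF"  # mildly reduced ejection fraction
    return "HFpEF"       # preserved ejection fraction


# LVEF is used only to build the target labels; it is never a model input.
labels = [phenotype_from_lvef(v) for v in (25, 45, 60)]
print(labels)
```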

We investigated the predictive performance of models that integrate natural language processing (NLP) and machine learning (ML) for AHF phenotype classification with random forests, leveraging clinical text and laboratory data from the MIMIC-III database. Feature selection for unstructured textual data and biochemical test data was performed using the LASSO method, with selected textual features converted into structured data using one-hot encoding. Overall performance was assessed by the areas under the receiver operating characteristic curve (AUROC) and the precision-recall curve (AUPRC).
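The workflow described above can be sketched with scikit-learn. This is a minimal illustration under several assumptions, not the authors' implementation: the notes and lab values are synthetic stand-ins (not MIMIC-III data), binary term-presence indicators from `CountVectorizer(binary=True)` play the role of the one-hot encoded textual features, and an L1-penalized `LinearSVC` stands in for the LASSO selection step. All variable names and hyperparameters are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
classes = ["HFrEF", "HFmrEF", "HFpEF"]

# Synthetic stand-ins for clinical notes and lab panels (NOT real patient data).
hints = {
    "HFrEF": ["dilated", "cardiomyopathy"],
    "HFmrEF": ["ischemia", "stent"],
    "HFpEF": ["hypertension", "obesity"],
}
all_hints = [word for words in hints.values() for word in words]
filler = ["patient", "admitted", "dyspnea", "edema", "stable"]


def make_note(label: str) -> str:
    """Build a toy note: filler words plus noisy class-related terms."""
    words = [str(w) for w in rng.choice(filler, size=4)]
    for _ in range(2):
        pool = hints[label] if rng.random() < 0.7 else all_hints
        words.append(str(rng.choice(pool)))
    return " ".join(words)


y = rng.integers(0, 3, size=300)
notes = [make_note(classes[i]) for i in y]
labs = rng.normal(loc=y[:, None] * 0.5, scale=1.0, size=(300, 5))

idx_train, idx_test = train_test_split(
    np.arange(300), test_size=0.3, stratify=y, random_state=0
)
y_train, y_test = y[idx_train], y[idx_test]

# Textual features as binary term-presence indicators (one column per term).
vectorizer = CountVectorizer(binary=True)
text_train = vectorizer.fit_transform([notes[i] for i in idx_train])
text_test = vectorizer.transform([notes[i] for i in idx_test])

# Laboratory features, standardized using training statistics only.
scaler = StandardScaler()
lab_train = scaler.fit_transform(labs[idx_train])
lab_test = scaler.transform(labs[idx_test])

# Combined dataset: textual and laboratory columns side by side.
X_train = hstack([text_train, csr_matrix(lab_train)]).tocsr()
X_test = hstack([text_test, csr_matrix(lab_test)]).tocsr()

# L1-penalized linear model as a stand-in for LASSO feature selection.
selector = SelectFromModel(
    LinearSVC(penalty="l1", dual=False, C=0.5, max_iter=5000)
).fit(X_train, y_train)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(selector.transform(X_train), y_train)
proba = forest.predict_proba(selector.transform(X_test))

# Macro-averaged one-vs-rest AUROC and AUPRC over the three phenotypes.
auroc = roc_auc_score(y_test, proba, multi_class="ovr", average="macro")
auprc = average_precision_score(
    label_binarize(y_test, classes=[0, 1, 2]), proba, average="macro"
)
print(f"selected features: {int(selector.get_support().sum())}")
print(f"AUROC={auroc:.2f}  AUPRC={auprc:.2f}")
```

The scores printed here reflect only the toy data and say nothing about the reported results. Fitting the vectorizer, scaler, and selector on the training split alone keeps the held-out split untouched until evaluation.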

Our final study cohort comprised 1,192 training datasets and 513 independent validation datasets with primary data types and LVEF information available. In the training dataset, the model built on the combined datasets performed best (accuracy: 0.70 ± 0.03, AUROC: 0.76 ± 0.02) compared with the textual or laboratory dataset alone, a result that was replicated in the independent validation dataset. Our model achieved optimal performance by selecting up to 100 combined features from both textual and laboratory data. Reducing the number of features to 20 did not substantially attenuate overall model performance; performance declined only when 10 features were selected.
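A feature-count comparison of this kind can be sketched as a cross-validated sweep over the number of retained features, reporting mean ± standard deviation for each setting. The example below is an assumption-laden illustration: it uses synthetic data from `make_classification`, ranks features with an L1-penalized `LinearSVC` as a stand-in for LASSO, and uses 5-fold cross-validation, none of which is confirmed by this summary as the authors' exact protocol.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic three-class data standing in for the combined feature matrix.
X, y = make_classification(
    n_samples=400,
    n_features=150,
    n_informative=15,
    n_redundant=5,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}

for k in (100, 20, 10):
    # Selection sits inside the pipeline so each fold ranks features
    # on its own training portion only.
    model = make_pipeline(
        StandardScaler(),
        SelectFromModel(
            LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000),
            threshold=-np.inf,  # select purely by rank ...
            max_features=k,     # ... keeping the top k features
        ),
        RandomForestClassifier(n_estimators=100, random_state=0),
    )
    scores = cross_validate(
        model, X, y, cv=cv, scoring=("accuracy", "roc_auc_ovr")
    )
    results[k] = {
        "accuracy_mean": scores["test_accuracy"].mean(),
        "accuracy_std": scores["test_accuracy"].std(),
        "auroc_mean": scores["test_roc_auc_ovr"].mean(),
        "auroc_std": scores["test_roc_auc_ovr"].std(),
    }

for k, r in results.items():
    print(
        f"k={k:>3}: accuracy {r['accuracy_mean']:.2f} ± {r['accuracy_std']:.2f}, "
        f"AUROC {r['auroc_mean']:.2f} ± {r['auroc_std']:.2f}"
    )
```

Keeping the selector inside the cross-validated pipeline avoids leaking held-out information into the feature ranking, which would otherwise inflate the scores for small feature counts.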

Our study enhances HF phenotype classification and underscores the value of multifaceted data analysis in clinical informatics, enabling more personalized heart failure treatment. Early identification of AHF phenotypes may support timely, phenotype-specific management and inform treatment decisions.

## Linked entities

- **Diseases:** heart failure (MONDO:0005252)

## Full-text entities

- **Diseases:** AHF (MESH:D006333)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12571787/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12571787/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12571787/full.md

---
Source: https://tomesphere.com/paper/PMC12571787