# Deep convolutional neural networks outperform vanilla machine learning when predicting language outcomes after stroke

**Authors:** Thomas M.H. Hope, Howard Bowman, Alex P. Leff, Cathy J. Price

PMC · DOI: 10.1016/j.nicl.2025.103880 · 2025-09-29

## TL;DR

Deep learning models outperform traditional machine learning when predicting language outcomes after stroke, working from lesion images directly rather than from tabular lesion features post-processed from those images.

## Contribution

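Drawing on a larger dataset than previous work of this kind, multi-input deep CNNs (one input for tabular demographic features, another for 3-dimensional lesion images) outperform a boosted-ensemble machine learning baseline when predicting post-stroke language outcomes.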

## Key findings

- Deep CNNs outperformed baseline machine learning models across multiple language outcome scores.
- Giving the CNNs lesion images, rather than tabular lesion features post-processed from those images, improved results rather than hindering performance.
- The advantage of CNNs was consistent across many outcome scores, with models compared using 5 × 2-fold cross-validation and consistent folds.

## Abstract

- Recent research used machine learning to predict language outcomes after stroke.
- We show that deep learning can outperform a strong baseline from that literature.
- This advantage was consistent across many outcome scores.

Current medicine cannot confidently predict patients’ language skills after stroke. In recent years, researchers have sought to bridge this gap with machine learning. These models appear to benefit from access to features describing where and how much brain damage these patients have suffered. Given the very high dimensionality of structural brain imaging data, those brain lesion features are typically post-processed from the images themselves into tabular features. With the introduction of deep Convolutional Neural Networks (CNNs), which appear to be much more robust to high-dimensional data, it is natural to hope that much of this image post-processing might be unnecessary. But prior attempts to demonstrate this (in the area of post-stroke prognostics) have so far yielded only equivocal results – perhaps because the datasets that those studies could deploy were too small to properly constrain CNNs, which are famously ‘data-hungry’.

The study draws on a much larger dataset than has been employed in previous work of this kind, comprising patients whose language outcomes were assessed once during the chronic phase post-stroke, on or around the same days as they underwent high-resolution MRI brain scans. Following the model of our own and others’ past work, we use state-of-the-art ‘vanilla’ machine learning models (boosted ensembles) to predict a variety of language and cognitive outcome scores. These models employ both demographic variables and features derived from the brain imaging data, which represent where brain damage has occurred. These are our baseline models. Next, we use deep CNNs to predict the same language scores for the same patients, drawing on both the demographic variables and post-processed brain lesion images: i.e., multi-input models with one input for tabular features and another for 3-dimensional images. We compare the models using 5 × 2-fold cross-validation, with consistent folds.
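To illustrate the comparison scheme described above, here is a minimal sketch of 5 × 2-fold cross-validation with consistent folds, using only the Python standard library. It assumes that "consistent folds" means every model is scored on identical train/test splits. The names `five_by_two_folds` and `compare`, and the `fit_and_score` callables, are hypothetical and not taken from the paper.

```python
import random


def five_by_two_folds(n_samples, n_repeats=5, seed=0):
    """Return n_repeats * 2 (train, test) index pairs for 5 x 2-fold CV."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    splits = []
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        half = n_samples // 2
        first, second = sorted(order[:half]), sorted(order[half:])
        # Each half serves once as the training set and once as the test set.
        splits.append((first, second))
        splits.append((second, first))
    return splits


def compare(models, splits):
    """Score every model on the same splits.

    `models` maps a model name to a hypothetical callable
    fit_and_score(train_indices, test_indices) -> float.
    """
    return {
        name: [fit_and_score(train, test) for train, test in splits]
        for name, fit_and_score in models.items()
    }


# Build the splits once, then reuse them for every model under comparison.
splits = five_by_two_folds(n_samples=10)
```

Because the splits are generated once and shared, any difference in the per-fold scores reflects the models rather than the partitioning of patients.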

The CNN models consistently outperform the vanilla machine learning models in this domain.

Deep CNNs offer state-of-the-art performance when predicting language outcomes after stroke, outperforming vanilla machine learning and obviating the need to post-process lesion images into lesion features.
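As a rough illustration of the multi-input idea (one input for tabular features, another for 3-dimensional images), the sketch below defines a small two-branch regressor. Everything here is an assumption made for illustration: the abstract does not name a framework or give the architecture, so the use of PyTorch, the layer sizes, the class name `MultiInputNet`, and the random input tensors are hypothetical.

```python
import torch
from torch import nn


class MultiInputNet(nn.Module):
    """Two-branch regressor: a 3D CNN for the image, an MLP for tabular data."""

    def __init__(self, n_tabular: int):
        super().__init__()
        # Image branch: layer sizes are illustrative, not the paper's.
        self.image_branch = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # -> (batch, 16, 1, 1, 1)
            nn.Flatten(),             # -> (batch, 16)
        )
        # Tabular branch for demographic variables.
        self.tabular_branch = nn.Sequential(nn.Linear(n_tabular, 8), nn.ReLU())
        # Head: regress one outcome score from the concatenated features.
        self.head = nn.Sequential(
            nn.Linear(16 + 8, 16), nn.ReLU(), nn.Linear(16, 1)
        )

    def forward(self, image, tabular):
        features = torch.cat(
            [self.image_branch(image), self.tabular_branch(tabular)], dim=1
        )
        return self.head(features).squeeze(1)  # -> (batch,)


# Random placeholder inputs: 4 cases, 16^3 single-channel volumes, 3 tabular features.
model = MultiInputNet(n_tabular=3)
images = torch.rand(4, 1, 16, 16, 16)
tabular = torch.rand(4, 3)
predictions = model(images, tabular)
```

The point of the design is that the image enters the network as a volume, so no tabular lesion features need to be derived from it beforehand; the tabular branch only carries the non-imaging variables.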

## Linked entities

- **Diseases:** stroke (MONDO:0005098)

## Full-text entities

- **Diseases:** brain damage (MESH:D001925), brain lesion (MESH:D001927), post-stroke (MESH:D020521), lesion (MESH:D009059)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12522715/full.md

---
Source: https://tomesphere.com/paper/PMC12522715