# Advancing modified barium swallow pre-sorting with deep learning: a new paradigm for the first step analysis in X-ray swallowing study

**Authors:** Shitong Mao, Mohamed A. Naser, Sheila Buoy, Kristy K. Brock, Katherine A. Hutcheson

PMC · DOI: 10.1007/s11548-025-03505-y · International Journal of Computer Assisted Radiology and Surgery · 2025-10-04

## TL;DR

This paper introduces a deep learning method that automatically sorts modified barium swallow (MBS) videos by imaging plane (AP vs. lateral) and by type (scout vs. bolus swallow), reducing the manual pre-sorting needed before a swallowing study can be analyzed.

## Contribution

The paper proposes a deep learning approach that automates the pre-sorting of modified barium swallow exams, with a multi-task learning variant that improves scout/bolus classification.

## Key findings

- The model achieved 99.68% frame-level and 100% video-level accuracy in differentiating AP from lateral planes.
- For distinguishing scout from bolus swallowing videos, multi-task learning raised video-level accuracy from 93.86% to 96.35%.

## Abstract

Modified barium swallow (MBS) exams are pivotal for assessing swallowing function and include diagnostic video segments imaged in various planes, such as anteroposterior (AP or coronal plane) and lateral (or mid-sagittal plane), alongside non-diagnostic ‘scout’ image segments used for anatomic reference and image set-up that do not include bolus swallows. These variations in imaging files necessitate manual sorting and labeling, complicating the pre-analysis workflow.

Our study introduces a deep learning approach to automate the categorization of swallow videos in MBS exams, distinguishing between the different types of diagnostic videos and identifying non-diagnostic scout videos to streamline the MBS review workflow. Our algorithms were developed on a dataset that included 3,740 video segments with a total of 986,808 frames from 285 MBS exams in 216 patients (average age 60 ± 9).

Our model achieved an accuracy of 99.68% at the frame level and 100% at the video level in differentiating AP from lateral planes. For distinguishing scout from bolus swallowing videos, the model reached an accuracy of 90.26% at the frame level and 93.86% at the video level. Incorporating a multi-task learning approach notably enhanced the video-level accuracy to 96.35% for scout/bolus video differentiation.
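The gap between frame-level and video-level accuracy can be illustrated with a small sketch. This summary does not say how the authors turn per-frame predictions into a video label, so the majority vote below is an assumption, and the function names (`video_label`, `accuracy`) and the toy clips are invented for illustration only.

```python
from collections import Counter


def video_label(frame_preds):
    """Majority vote over per-frame labels (assumed aggregation rule).

    Ties go to the label that appears first in the sequence.
    """
    if not frame_preds:
        raise ValueError("frame_preds must not be empty")
    return Counter(frame_preds).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of positions where the prediction matches the truth."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy data, invented for illustration: (per-frame predictions, true label).
videos = {
    "clip_a": (["bolus", "bolus", "scout", "bolus"], "bolus"),
    "clip_b": (["scout", "scout", "scout", "bolus"], "scout"),
}

# Frame level: every frame inherits the true label of its video.
frame_pred = [p for preds, _ in videos.values() for p in preds]
frame_true = [truth for preds, truth in videos.values() for _ in preds]

# Video level: one aggregated prediction per video.
video_pred = [video_label(preds) for preds, _ in videos.values()]
video_true = [truth for _, truth in videos.values()]

frame_acc = accuracy(frame_pred, frame_true)
video_acc = accuracy(video_pred, video_true)
print(f"frame-level: {frame_acc:.2f}, video-level: {video_acc:.2f}")
```

In this toy case, one wrong frame per clip lowers frame-level accuracy, but the vote still recovers the correct label for each clip, which is one way a video-level score can exceed the frame-level score.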

Our analysis highlighted the importance of leveraging inter-frame connectivity for improving model performance. These findings indicate that automated pre-sorting can substantially improve MBS exam processing efficiency, minimizing manual sorting effort and allowing raters to devote more attention to clinical interpretation and patient care.
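As a rough illustration of why information from neighboring frames can help, the sketch below smooths per-frame probabilities with a centered moving average before thresholding. This is not the authors' method, which this summary does not describe; the window size, the 0.5 threshold, the function names (`smooth`, `to_labels`), and the probability values are all assumptions made for the example.

```python
def smooth(probs, window=3):
    """Centered moving average; the window shrinks at the sequence edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    out = []
    for i in range(len(probs)):
        lo = max(0, i - half)
        hi = min(len(probs), i + half + 1)
        out.append(sum(probs[lo:hi]) / (hi - lo))
    return out


def to_labels(probs, threshold=0.5):
    """Map P(bolus) per frame to a label using an assumed threshold."""
    return ["bolus" if p >= threshold else "scout" for p in probs]


# Hypothetical per-frame P(bolus) for one clip; frame 2 is an isolated dip.
raw = [0.9, 0.8, 0.2, 0.9, 0.7]

raw_labels = to_labels(raw)               # frame 2 is labeled "scout"
smoothed_labels = to_labels(smooth(raw))  # the dip is averaged away

print(raw_labels)
print(smoothed_labels)
```

Classifying each frame in isolation leaves a single outlier frame mislabeled, whereas averaging with its neighbors yields a consistent label across the clip. A learned temporal model would exploit the same kind of inter-frame structure in a less crude way.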

## Full-text entities

- **Chemicals:** barium (MESH:D001464)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12551561/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12551561/full.md

---
Source: https://tomesphere.com/paper/PMC12551561